Method and system for lithography process-window-maximizing optical proximity correction

ABSTRACT

An efficient OPC method of increasing imaging performance of a lithographic process utilized to image a target design having a plurality of features. The method includes determining a function for generating a simulated image, where the function accounts for process variations associated with the lithographic process; and optimizing target gray level for each evaluation point in each OPC iteration based on this function. In one given embodiment, the function is approximated as a polynomial function of focus and exposure, R(ε,ƒ)=P0+ƒ2·Pb with a threshold of T+Vε for contours, where PO represents image intensity at nominal focus, ƒ represents the defocus value relative to the nominal focus, ε represents the exposure change, V represents the scaling of exposure change, and parameter “Pb” represents second order derivative images. In another given embodiment, the analytical optimal gray level is given for best focus with the assumption that the probability distribution of focus and exposure variation is Gaussian.

The present application is a continuation of U.S. patent applicationSer. No. 12/642,436, filed Dec. 18, 2009, which claims priority to U.S.Provisional Patent Application No. 61/138,865, filed on Dec. 18, 2008,the entire contents of each of the foregoing applications isincorporated herein by reference.

FIELD

The present invention relates generally to a method and program productfor performing simulation and the optical proximity correction of theimaging results associated with a lithography process, and morespecifically to a computationally efficient optical proximity correction(OPC) method of optimizing imaging performance of a lithographicapparatus that accounts for parameter variations over a process window.

BACKGROUND

Lithographic apparatus can be used, for example, in the manufacture ofintegrated circuits (ICs). In such a case, the mask may contain acircuit pattern corresponding to an individual layer of the IC, and thispattern can be imaged onto a target portion (e.g. comprising one or moredies) on a substrate (silicon wafer) that has been coated with a layerof radiation-sensitive material (resist). In general, a single waferwill contain a whole network of adjacent target portions that aresuccessively irradiated via the projection system, one at a time. In onetype of lithographic projection apparatus, each target portion isirradiated by exposing the entire mask pattern onto the target portionin one go; such an apparatus is commonly referred to as a wafer stepper.In an alternative apparatus, commonly referred to as a step-and-scanapparatus, each target portion is irradiated by progressively scanningthe mask pattern under the projection beam in a given referencedirection (the “scanning” direction) while synchronously scanning thesubstrate table parallel or anti-parallel to this direction. Since, ingeneral, the projection system will have a magnification factor M(generally <1), the speed V at which the substrate table is scanned willbe a factor M times that at which the mask table is scanned. Moreinformation with regard to lithographic devices as described herein canbe gleaned, for example, from U.S. Pat. No. 6,046,792, incorporatedherein by reference.

In a manufacturing process using a lithographic projection apparatus, amask pattern is imaged onto a substrate that is at least partiallycovered by a layer of radiation-sensitive material (resist). Prior tothis imaging step, the substrate may undergo various procedures, such aspriming, resist coating and a soft bake. After exposure, the substratemay be subjected to other procedures, such as a post-exposure bake(PEB), development, a hard bake and measurement/inspection of the imagedfeatures. This array of procedures is used as a basis to pattern anindividual layer of a device, e.g., an IC. Such a patterned layer maythen undergo various processes such as etching, ion-implantation(doping), metallization, oxidation, chemo-mechanical polishing, etc.,all intended to finish off an individual layer. If several layers arerequired, then the whole procedure, or a variant thereof, will have tobe repeated for each new layer. Eventually, an array of devices will bepresent on the substrate (wafer). These devices are then separated fromone another by a technique such as dicing or sawing, whence theindividual devices can be mounted on a carrier, connected to pins, etc.

For the sake of simplicity, the projection system may hereinafter bereferred to as the “lens”; however, this term should be broadlyinterpreted as encompassing various types of projection systems,including refractive optics, reflective optics, and catadioptricsystems, for example. The radiation system may also include componentsoperating according to any of these design types for directing, shapingor controlling the projection beam of radiation, and such components mayalso be referred to below, collectively or singularly, as a “lens”.Further, the lithographic apparatus may be of a type having two or moresubstrate tables (and/or two or more mask tables). In such “multiplestage” devices the additional tables may be used in parallel, orpreparatory steps may be carried out on one or more tables while one ormore other tables are being used for exposures. Twin stage lithographicapparatus are described, for example, in U.S. Pat. No. 5,969,441,incorporated herein by reference.

The photolithographic masks referred to above comprise geometricpatterns corresponding to the circuit components to be integrated onto asilicon wafer. The patterns used to create such masks are generatedutilizing CAD (computer-aided design) programs, this process often beingreferred to as EDA (electronic design automation). Most CAD programsfollow a set of predetermined design rules in order to create functionalmasks. These rules are set by processing and design limitations. Forexample, design rules define the space tolerance between circuit devices(such as gates, capacitors, etc.) or interconnect lines, so as to ensurethat the circuit devices or lines do not interact with one another in anundesirable way. The design rule limitations are typically referred toas “critical dimensions” (CD). A critical dimension of a circuit can bedefined as the smallest width of a line or hole or the smallest spacebetween two lines or two holes. Thus, the CD determines the overall sizeand density of the designed circuit. Of course, one of the goals inintegrated circuit fabrication is to faithfully reproduce the originalcircuit design on the wafer (via the mask).

As noted, microlithography is a central step in the manufacturing ofsemiconductor integrated circuits, where patterns formed onsemiconductor wafer substrates define the functional elements ofsemiconductor devices, such as microprocessors, memory chips etc.Similar lithographic techniques are also used in the formation of flatpanel displays, micro-electro mechanical systems (MEMS) and otherdevices.

As semiconductor manufacturing processes continue to advance, thedimensions of circuit elements have continually been reduced while theamount of functional elements, such as transistors, per device has beensteadily increasing over decades, following a trend commonly referred toas ‘Moore's law’. At the current state of technology, critical layers ofleading-edge devices are manufactured using optical lithographicprojection systems known as scanners that project a mask image onto asubstrate using illumination from a deep-ultraviolet laser light source,creating individual circuit features having dimensions well below 100nm, i.e. less than half the wavelength of the projection light.

This process, in which features with dimensions smaller than theclassical resolution limit of an optical projection system are printed,is commonly known as low-k_(l) lithography, according to the resolutionformula CD=k_(l)×λ/NA, where λ is the wavelength of radiation employed(currently in most cases 248 nm or 193 nm), NA is the numerical apertureof the projection optics, CD is the ‘critical dimension’—generally thesmallest feature size printed—and k_(l) is an empirical resolutionfactor. In general, the smaller k_(l), the more difficult it becomes toreproduce a pattern on the wafer that resembles the shape and dimensionsplanned by a circuit designer in order to achieve particular electricalfunctionality and performance. To overcome these difficulties,sophisticated fine-tuning steps are applied to the projection system aswell as to the mask design. These include, for example, but not limitedto, optimization of NA and optical coherence settings, customizedillumination schemes, use of phase shifting masks, optical proximitycorrection in the mask layout, or other methods generally defined as‘resolution enhancement techniques’ (RET).

As one important example, optical proximity correction (OPC, sometimesalso referred to as ‘optical and process correction’) addresses the factthat the final size and placement of a printed feature on the wafer willnot simply be a function of the size and placement of the correspondingfeature on the mask. It is noted that the terms ‘mask’ and ‘reticle’ areutilized interchangeably herein. For the small feature sizes and highfeature densities present on typical circuit designs, the position of aparticular edge of a given feature will be influenced to a certainextent by the presence or absence of other adjacent features. Theseproximity effects arise from minute amounts of light coupled from onefeature to another. Similarly, proximity effects may arise fromdiffusion and other chemical effects during post-exposure bake (PEB),resist development, and etching that generally follow lithographicexposure.

In order to ensure that the features are generated on a semiconductorsubstrate in accordance with the requirements of the given targetcircuit design, proximity effects need to be predicted utilizingsophisticated numerical models, and corrections or pre-distortions needto be applied to the design of the mask before successful manufacturingof high-end devices becomes possible. The article “Full-Chip LithographySimulation and Design Analysis—How OPC Is Changing IC Design”, C.Spence, Proc. SPIE, Vol. 5751, pp 1-14 (2005) provides an overview ofcurrent ‘model-based’ optical proximity correction processes. In atypical high-end design almost every feature edge requires somemodification in order to achieve printed patterns that come sufficientlyclose to the target design. These modifications may include shifting orbiasing of edge positions or line widths as well as application of‘assist’ features that are not intended to print themselves, but willaffect the properties of an associated primary feature.

The application of model-based OPC to a target design requires goodprocess models and considerable computational resources, given the manymillions of features typically present in a chip design. However,applying OPC is generally not an ‘exact science’, but an empirical,iterative process that does not always resolve all possible weaknesseson a layout. Therefore, post-OPC designs, i.e. mask layouts afterapplication of all pattern modifications by OPC and any other RET's,need to be verified by design inspection, i.e. intensive full-chipsimulation using calibrated numerical process models, in order tominimize the possibility of design flaws being built into themanufacturing of a mask set. This is driven by the enormous cost ofmaking high-end mask sets, which run in the multi-million dollar range,as well as by the impact on turn-around time by reworking or repairingactual masks once they have been manufactured.

Both OPC and full-chip RET verification may be based on numericalmodeling systems and methods as described, for example in, U.S. Pat. No.7,003,758 and an article titled “Optimized Hardware and Software ForFast, Full Chip Simulation”, by Y. Cao et al., Proc. SPIE, Vol. 5754,405 (2005).

While full-chip numerical simulation of the lithographic patterningprocess has been demonstrated at a single process condition, typicallybest focus and best exposure dose or best ‘nominal’ condition, it iswell known that manufacturability of a design requires sufficienttolerance of pattern fidelity against small variations in processconditions that are unavoidable during actual manufacturing. Thistolerance is commonly expressed as a process window, defined as thewidth and height (or ‘latitude’) in exposure-defocus space over which CDor edge placement variations are within a predefined margin (i.e., errortolerance), for example ±10% of the nominal line width. In practice, theactual margin requirement may differ for different feature types,depending on their function and criticality. Furthermore, the processwindow concept can be extended to other basis parameters in addition toor besides exposure dose and defocus.

Manufacturability of a given design generally depends on the commonprocess window of all features in a single layer. While state-of-the-artOPC application and design inspection methods are capable of optimizingand verifying a design at nominal conditions, it has been recentlyobserved that process-window aware OPC models will be required in orderto ensure manufacturability at future process nodes due toever-decreasing tolerances and CD requirements.

Currently, in order to map out the process window of a given design withsufficient accuracy and coverage, simulations at N parameter settings(e.g., defocus and exposure dose) are required, where N can be on theorder of a dozen or more. Consequently, an N-fold multiplication ofcomputation time is necessary if these repeated simulations at varioussettings are directly incorporated into the framework of an OPCapplication and verification flow, which typically will involve a numberof iterations of full-chip lithography simulations. However, such anincrease in the computational time is prohibitive when attempting tovalidate and/or design a given target circuit.

As such, there is a need for simulation methods and systems whichaccount for variations in the process-window that can be used for OPCand RET verification, and that are more computationally efficient thansuch a ‘brute-force’ approach of repeated simulation at variousconditions as is currently performed by known prior art systems.

PCT Patent App. No. US2009/049792 describes an efficient methodology fordense process window simulation and addresses the need forprocess-window aware RET verification. What remains desirable is toobtain a methodology for OPC based on the efficient dense process windowsimulation to maximize the process window for full chip design.

SUMMARY

Accordingly, the present invention relates to a computationallyefficient OPC method which maximizes the process window for use in asimulation process, and which overcomes the foregoing deficiencies ofthe prior art techniques.

More specifically, the present invention relates to an efficient OPCmethod of increasing performance of a lithographic process utilized toimage a target design having a plurality of features. The methodincludes the steps of determining a function for generating a simulatedimage, where the function accounts for process variations associatedwith the lithographic process; and optimizing target gray level for eachevaluation point in each OPC iteration based on this function. In onegiven embodiment, the function is approximated as a polynomial functionof focus and exposure. In another given embodiment, the analyticaloptimal gray level is given for best focus with the assumption that theprobability distribution of focus and exposure variation is Gaussian.

The present invention provides significant advantages over prior artmethods. Most importantly, the present invention provides acomputationally efficient OPC approach which maximizes the processwindow (e.g., achieves maximum tolerance for focus variations andexposure dose variations), and eliminates the need to perform the‘brute-force’ approach of repeated simulation at various conditions asis currently practiced by known prior art methods. Indeed, as furthernoted below, for a Process Window aware OPC with N process windowconditions to optimize, the computation time of the present method isapproximately 2T, whereas the prior art method would requireapproximately NT, where T denotes the computation time required for OPCat one process window condition.

The method is also readily applied to other applications such as, butnot limited to, model calibration; lithography design inspection; yieldestimates based on evaluation of common process windows; identificationof hot spots (or problem spots) and correction of such hot-spots byutilizing process window aware OPC; and model-based process controlcorrections (e.g., to center the common process window for a givenlithography layer in the lithography process).

In an aspect of the invention, a method of maximizing a process windowassociated with a given lithography process is disclosed. The methodincludes the steps of computing an analytical function of processcondition parameters that approximates a resist image value across aprocess window for each of a plurality of evaluation points in thetarget pattern; determining a target value of the resist image value atnominal condition for each evaluation point based on the analyticalfunction, so that the process window is maximized; and using the targetvalue as an optimizing target for each evaluation point in an opticalproximity correction iteration.

In another aspect of the invention, a method for maximizing a processwindow associated with a given lithography process is disclosed. Themethod includes the steps of determining an acceptable variation ofresist image values around a fixed threshold for a plurality ofevaluation points in the target pattern; computing an optimal targetvalue within the acceptable variation of resist image values for eachevaluation point for a nominal process window condition, so that theprocess parameter variation range is maximal subject to the conditionthat the resist image value is kept within its acceptable variation;performing an edge movement process in an iterative manner in an opticalproximity correction process until an approximated resist image value ateach evaluation point converges to the optimal target value.

In another aspect of the invention, a method for maximizing a processwindow associated with a given lithography process is disclosed. Themethod includes the steps of determining an optimal target gray level ofa resist image value for each of a plurality of evaluation points in thetarget pattern based on an analytical function so as to maximize theprocess window; using the target gray level value as the optimizationtarget for the resist image value for each evaluation point in anoptical proximity correction iteration; and determining the best edgemovement amount of the optical proximity correction iteration so thatthe resulting resist image value equals to the target gray level.

In another aspect of the invention, a computer program product havingcomputer-executable instructions for causing a computer to maximizelithographic process window for a target pattern, the instructionscausing the computer to perform a method, is disclosed. The methodincludes the steps of computing an analytical function of the processparameters that approximates a resist image value across a processwindow for each of a plurality of evaluation points in the targetpattern; determining a target value of the resist image value at thenominal process condition for each evaluation point based on theanalytical function, so that the process window is maximized; and usingthe target value as an optimizing target for each evaluation point in anoptical proximity correction iteration.

In another aspect of the invention, a device manufactured according to amethod to maximize lithographic process window for a target pattern isdisclosed. The method includes the steps of computing an analyticalfunction of the process parameters that approximates a resist imagevalue across a process window for each of a plurality of evaluationpoints in a target pattern; determining a target value of the resistimage value at the nominal process condition for each evaluation pointbased on the analytical function, so that the process window ismaximized; using the target value as an optimizing target for eachevaluation point in an optical proximity correction iteration; andimaging the target pattern after one or more of the optical proximitycorrection iterations using a lithographic apparatus.

In another aspect of the invention, a method for maximizing a processwindow associated with a given lithography process is disclosed. Themethod includes the steps of determining an optimal target gray level ofa resist image value at a nominal process condition for each of aplurality of evaluation points in the target pattern based on ananalytical function so as to maximize the process window for a givennominal condition; performing an edge movement process in an iterativemanner in an optical proximity correction process until an approximatedresist image value at each evaluation point converges to the optimaltarget gray level value at the nominal process condition; determiningthe optimal nominal condition for the resulting resist image from theoptical proximity correction so as to maximize the process window; andalternatively re-performing the determination of the optimal target graylevel, the optical proximity correction, and the determination of theoptimal nominal condition until convergence on an optimal targetpattern.

The invention itself, together with further objects and advantages, canbe better understood by reference to the following detailed descriptionand the accompanying schematic drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is an exemplary block diagram illustrating a typical lithographicprojection system.

FIG. 2 is an exemplary block diagram illustrating the functional modulesof a lithographic simulation model.

FIG. 3 illustrates an exemplary flowchart of a first embodiment of thepresent invention.

FIG. 4 illustrates an exemplary flowchart of a second embodiment of thepresent invention.

FIG. 5 illustrates an exemplary flowchart of a third embodiment of thepresent invention.

FIG. 6 shows a graphical example of the object functions to identifyoptimal P₀ for τ≤1 (τ=0.75 in this example) in accordance with an aspectof the present invention.

FIG. 7 shows a graphical example of the object functions to identifyoptimal P₀ for τ>1 (τ=2 in this example) in accordance with an aspect ofthe present invention.

FIG. 8 shows a graphical interpretation for the optimal P₀ for τ≤1(τ=0.75 in this example) in accordance with an aspect of the presentinvention.

FIG. 9 shows a graphical interpretation for the optimal P₀ for τ>1 (τ=2in this example) in accordance with an aspect of the present invention.

FIG. 10 shows a graphical interpretation for (ƒ(P),ε(P₀)) when P₀≥T₁=Kin accordance with an aspect of the present invention.

FIG. 11 shows a graphical interpretation for (ƒ(P),ε(P₀)) when P₀≤T₂=K.Note that in this example, there are three real roots to (Eq.49), inaccordance with an aspect of the present invention.

FIG. 12 shows an exemplary flow diagram of a bisection method to findoptimal P₀ for each evaluation point in accordance with an aspect of thepresent invention.

FIG. 13 shows an exemplary high-level flow diagram of PWM-OPC inaccordance with an aspect of the present invention.

FIG. 14 shows a high-level flow diagram of OPC combined with NominalCondition Optimization in accordance with an aspect of the presentinvention.

FIG. 15 is a block diagram that illustrates a computer system which canassist in the implementation of the simulation method of the presentinvention.

FIG. 16 schematically depicts a lithographic projection apparatussuitable for use with the method of the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The present invention will now be described in detail with reference tothe drawings, which are provided as illustrative examples of theinvention so as to enable those skilled in the art to practice theinvention. Notably, the figures and examples below are not meant tolimit the scope of the present invention to a single embodiment, butother embodiments are possible by way of interchange of some or all ofthe described or illustrated elements. Moreover, where certain elementsof the present invention can be partially or fully implemented usingknown components, only those portions of such known components that arenecessary for an understanding of the present invention will bedescribed, and detailed descriptions of other portions of such knowncomponents will be omitted so as not to obscure the invention.

Embodiments described as being implemented in software should not belimited thereto, but can include embodiments implemented in hardware, orcombinations of software and hardware, and vice-versa, as will beapparent to those skilled in the art, unless otherwise specified herein.In the present specification, an embodiment showing a singular componentshould not be considered limiting; rather, the invention is intended toencompass other embodiments including a plurality of the same component,and vice-versa, unless explicitly stated otherwise herein. Moreover,applicants do not intend for any term in the specification or claims tobe ascribed an uncommon or special meaning unless explicitly set forthas such. Further, the present invention encompasses present and futureknown equivalents to the known components referred to herein by way ofillustration.

Although specific reference may be made in this text to the use of theinvention in the manufacture of ICs, it should be explicitly understoodthat the invention has many other possible applications. For example, itmay be employed in the manufacture of integrated optical systems,guidance and detection patterns for magnetic domain memories,liquid-crystal display panels, thin-film magnetic heads, etc. Theskilled artisan will appreciate that, in the context of such alternativeapplications, any use of the terms “reticle”, “wafer” or “die” in thistext should be considered as being replaced by the more general terms“mask”, “substrate” and “target portion”, respectively.

In the present document, the terms “radiation” and “beam” are used toencompass all types of electromagnetic radiation, including ultravioletradiation (e.g. with a wavelength of 365, 248, 193, 157 or 126 nm) andEUV (extreme ultra-violet radiation, e.g. having a wavelength in therange 5-20 nm).

The term mask as employed in this text may be broadly interpreted asreferring to generic patterning means that can be used to endow anincoming radiation beam with a patterned cross-section, corresponding toa pattern that is to be created in a target portion of the substrate;the term “light valve” can also be used in this context. Besides theclassic mask (transmissive or reflective; binary, phase-shifting,hybrid, etc.), examples of other such patterning means include:

-   -   a programmable mirror array. An example of such a device is a        matrix-addressable surface having a viscoelastic control layer        and a reflective surface. The basic principle behind such an        apparatus is that (for example) addressed areas of the        reflective surface reflect incident light as diffracted light,        whereas unaddressed areas reflect incident light as undiffracted        light. Using an appropriate filter, the said undiffracted light        can be filtered out of the reflected beam, leaving only the        diffracted light behind; in this manner, the beam becomes        patterned according to the addressing pattern of the        matrix-addressable surface. The required matrix addressing can        be performed using suitable electronic means. More information        on such mirror arrays can be gleaned, for example, from U.S.        Pat. Nos. 5,296,891 and 5,523,193, which are incorporated herein        by reference.    -   a programmable LCD array. An example of such a construction is        given in U.S. Pat. No. 5,229,872, which is incorporated herein        by reference.

Prior to discussing the present invention, a brief discussion regardingthe overall simulation and imaging process is provided. FIG. 1illustrates an exemplary lithographic projection system 10. The majorcomponents are a light source 12, which may be a deep-ultravioletexcimer laser source, illumination optics which define the partialcoherence (denoted as sigma) and which may include specific sourceshaping optics 14, 16 a and 16 b; a mask or reticle 18; and projectionoptics 16 c that produce an image of the reticle pattern onto the waferplane 22. An adjustable filter or aperture 20 at the pupil plane mayrestrict the range of beam angles that impinge on the wafer plane 22,where the largest possible angle defines the numerical aperture of theprojection optics NA=sin(λ_(max)).

In a lithography simulation system, these major system components can bedescribed by separate functional modules, for example, as illustrated inFIG. 2. Referring to FIG. 2, the functional modules include the designlayout module 26, which defines the target design; the mask layoutmodule 28, which defines the mask to be utilized in imaging process; themask model module 30, which defines the model of the mask layout to beutilized during the simulation process; the optical model module 32,which defines the performance of the optical components of lithographysystem; and the resist model module 34, which defines the performance ofthe resist being utilized in the given process. As is known, the resultof the simulation process produces, for example, predicted contours andCDs in the result module 36.

More specifically, it is noted that the properties of the illuminationand projection optics are captured in the optical model 32 thatincludes, but not limited to, NA-sigma (σ) settings as well as anyparticular illumination source shape. The optical properties of thephoto-resist layer coated on a substrate—i.e. refractive index, filmthickness, propagation and polarization effects—may also be captured aspart of the optical model 32. The mask model 30 captures the designfeatures of the reticle and may also include a representation ofdetailed physical properties of the mask, as described, for example, inU.S. Pat. No. 7,587,704. Finally, the resist model 34 describes theeffects of chemical processes which occur during resist exposure, PEBand development, in order to predict, for example, contours of resistfeatures formed on the substrate wafer. The objective of the simulationis to accurately predict, for example, edge placements and CDs, whichcan then be compared against the target design. The target design, isgenerally defined as the pre-OPC mask layout, and will be provided in astandardized digital file format such as GDSII or OASIS.

In general, the connection between the optical and the resist model is asimulated aerial image within the resist layer, which arises from theprojection of light onto the substrate, refraction at the resistinterface and multiple reflections in the resist film stack. The lightintensity distribution (aerial image) is turned into a latent ‘resistimage’ by absorption of photons, which is further modified by diffusionprocesses and various loading effects. Efficient simulation methods thatare fast enough for full-chip applications approximate the realistic3-dimensional intensity distribution in the resist stack by a2-dimensional aerial (and resist) image. An efficient implementation ofa lithography model is possible using the following formalism, where theimage (here in scalar form, which may be extended to includepolarization vector effects) is expressed as a Fourier sum over signalamplitudes in the pupil plane. According to the standard Hopkins theory,the aerial image may be defined by:

$\begin{matrix}\begin{matrix}{{I(x)} = {\sum\limits_{k}{{{A(k)}{\sum\limits_{k^{\prime}}{{M\left( {k^{\prime} - k} \right)}{P\left( k^{\prime} \right)}{\exp\left( {{- j}\; k^{\prime}x} \right)}}}}}^{2}}} \\{= {\sum\limits_{k}{{A(k)}^{2}\left\{ {\sum\limits_{k^{\prime}}{\sum\limits_{k^{''}}{{M\left( {k^{\prime} - k} \right)}{P\left( k^{\prime} \right)}{M^{*}\left( {k^{''} -} \right.}}}} \right.}}} \\\left. {\left. k \right){P^{*}\left( k^{''} \right)}{\exp\left( {{- {j\left( {k^{\prime} - k^{''}} \right)}}x} \right)}} \right\} \\{= {\sum\limits_{k^{\prime}}{\sum\limits_{k^{''}}\left\lbrack {\sum\limits_{k}{{A(k)}^{2}{P\left( {k + k^{\prime}} \right)}{P^{*}\left( {k +} \right.}}} \right.}}} \\{\left. \left. k^{''} \right) \right\rbrack{M\left( k^{\prime} \right)}{M^{*}\left( k^{''} \right)}{\exp\left( {{- {j\left( {k^{\prime} - k^{''}} \right)}}x} \right)}} \\{= {\sum\limits_{k^{\prime}}{\sum\limits_{k^{''}}{{TCC}_{k^{\prime},k^{''}}{M\left( k^{\prime} \right)}{M^{*}\left( k^{''} \right)}{\exp\left( {- {j\left( {k^{\prime} -} \right.}} \right.}}}}} \\\left. {\left. k^{''} \right)x} \right)\end{matrix} & \left( {{Eq}.\mspace{14mu} 1} \right)\end{matrix}$where, I(x) is the aerial image intensity at point x within the imageplane (for notational simplicity, a two-dimensional coordinaterepresented by a single variable is utilized), k represents a point onthe source plane, A(k) is the source amplitude from point k, k′ and k″are points on the pupil plane, M is the Fourier transform of the maskimage, P is the pupil function, andTCC_(k′,k″)=Σ_(k)A(k)²P(k+k′)P*(k+k″). An important aspect of theforegoing derivation is the change of summation order (moving the sumover k inside) and indices (replacing k′ with k+k′ and replacing k″ withk+k″), which results in the separation of the Transmission CrossCoefficients (TCCs), defined by the term inside the square brackets inthe third line in the equation. These coefficients are independent ofthe mask pattern and therefore can be pre-computed using knowledge ofthe optical elements or configuration only (e.g., NA and a or thedetailed illuminator profile). It is further noted that although in thegiven example (Eq.1) is derived from a scalar imaging model, thisformalism can also be extended to a vector imaging model, where TE andTM polarized light components are summed separately.

Furthermore, the approximate aerial image can be calculated by usingonly a limited number of dominant TCC terms, which can be determined bydiagonalizing the TCC matrix and retaining the terms corresponding toits largest eigenvalues, i.e.,

$\begin{matrix}{{TCC}_{k^{\prime},k^{''}} = {\sum\limits_{i = 1}^{N}\;{\lambda_{i}{\phi_{i}\left( k^{\prime} \right)}{\phi_{i}^{*}\left( k^{''} \right)}}}} & \left( {{Eq}.\mspace{14mu} 2} \right)\end{matrix}$where λ_(i) (i=1, . . . , N) denotes the N largest eigenvalues andϕ_(i)(•) denotes the corresponding eigenvector of the TCC matrix. It isnoted that (Eq.2) is exact when all terms are retained in theeigenseries expansion, i.e., when N is equal to the rank of the TCCmatrix. However, in actual applications, it is typical to truncate theseries by selecting a smaller N to increase the speed of the computationprocess.

Thus, (Eq.1) can be rewritten as:

$\begin{matrix}\begin{matrix}{{I(x)} = {\sum\limits_{k^{\prime}}{\sum\limits_{k^{''}}{{TCC}_{k^{\prime},k^{''}}{M\left( k^{\prime} \right)}{M^{*}\left( k^{''} \right)}{\exp\left( {{- {j\left( {k^{\prime} - k^{''}} \right)}}x} \right)}}}}} \\{= {\sum\limits_{k^{\prime}}{\sum\limits_{k^{''}}{\overset{N}{\sum\limits_{i = 1}}{\lambda_{i}{\phi_{i}\left( k^{\prime} \right)}{\phi_{i}^{*}\left( k^{''} \right)}{M\left( k^{\prime} \right)}{M^{*}\left( k^{''} \right)}{\exp\left( {{- {j\left( {k^{\prime} - k^{''}} \right)}}x} \right)}}}}}} \\{= {\overset{N}{\sum\limits_{i = 1}}{\lambda_{i}{\sum\limits_{k^{\prime}}{{\phi_{i}\left( k^{\prime} \right)}{M\left( k^{\prime} \right)}{\exp\left( {{- j}\; k^{\prime}x} \right)}{\sum\limits_{k^{''}}\phi_{i}^{*}}}}}}} \\{\left( k^{''} \right){M^{*}\left( k^{''} \right)}{\exp\left( {{jk}^{''}x} \right)}} \\{= {\underset{i = 1}{\sum\limits^{N}}{\lambda_{i}{{\Phi_{i}(x)}}^{2}}}}\end{matrix} & \left( {{Eq}.\mspace{14mu} 3} \right)\end{matrix}$where

${\Phi_{i}(x)} = {\sum\limits_{k^{''}}{{\phi_{i}\left( k^{''} \right)}{M\left( k^{''} \right)}{\exp\left( {{- j}\; k^{''}x} \right)}}}$and |•| denotes the magnitude of a complex number.

Using a sufficiently large number of TCC terms and a suitable modelcalibration methodology allows for an accurate description of theoptical projection process and provides for ‘separability’ of thelithographic simulation model into the optics and resist models orparts. In an ideal, separable model, all optical effects such as NA,sigma, defocus, aberrations etc. are accurately captured in the opticalmodel module, while only resist effects are simulated by the resistmodel. In practice, however, all ‘efficient’ lithographic simulationmodels (as opposed to first-principle models, which are generally tooslow and require too many adjustable parameters to be practical forfull-chip simulations) are empirical to some extent and will use alimited set of parameters. There may in some cases be ‘lumped’parameters that account for certain combined net effects of both opticaland resist properties. For example, diffusion processes during PEB ofresist can be modeled by a Gaussian filter that blurs the image formedin resist, while a similar filter might also describe the effect ofstray light, stage vibration, or the combined effect of high-orderaberrations of the projection system. Lumped parameters can reproduceprocess behavior close to fitted calibration points, but will haveinferior predictive power compared with separable models. Separabilitytypically requires a sufficiently detailed model form—in the exampleabove, e.g., using 2 independent filters for optical blurring and resistdiffusion—as well as a suitable calibration methodology that assuresisolation of optical effects from resist effects.

While a separable model may generally be preferred for mostapplications, it is noted that the description of through-process window“PW” aerial image variations associated with the method of the presentinvention set forth below does not require strict model separability.Methods for adapting a general resist model in order to accuratelycapture through-PW variations are also detailed below in conjunctionwith the method of the present invention.

The present invention provides the efficient simulation of lithographicpatterning performance covering parameter variations throughout aprocess window, (for instance a variation of exposure dose and defocusand/or other process parameters). In the remainder of the document theremay be specific references to exposure dose and defocus only, but thisis merely for clearity of an explanation, other process parameters couldbe filled in there as well. To summarize, using an image-based approach,the method provides polynomial series expansions for characteristics ofaerial images (such as aerial image intensity) or of resist images as afunction of process parameters variations such as (de-)focus, exposuredose, and/or other additional coordinates of a generalized processwindow (PW). These expressions involve images and derivative imageswhich relate to TCCs and derivative TCC matrices. Linear combinations ofthese expressions allow for a highly efficient evaluation of the imagegenerated at any arbitrary process window (PW) point. In addition, edgeplacement shifts or critical dimension (CD) variations throughout theprocess window (PW) are also expressed in analytical form as simplelinear combinations of a limited set of simulated images. This set ofimages may be generated within a computation time on the order ofapproximately 2 times the computation time for computing a single imageat NC (Nominal Condition), rather than N× by computing images at Nseparate PW conditions. Once this set of images is known, the completethrough-PW behavior of every single edge or CD on the design can beimmediately determined.

It is noted that the methods of the present invention may also beutilized in conjunction with model calibration, lithography designinspection, yield estimates based on evaluating the common PW,identification of hot spots, modification and repair of hot spots byPW-aware OPC, and model-based process control corrections, e.g., tocenter the common PW of a litho layer.

The basic approach of the method can be understood by consideringthrough-focus changes in resist line width (or edge placement) of ageneric resist line. It is well known that the CD of the resist linetypically has a maximum or minimum value at best focus, but the CDvaries smoothly with defocus in either direction. Therefore, thethrough-focus CD variations of a particular feature may be approximatedby a polynomial fit of CD vs. defocus, e.g. a second-order fit for asufficiently small defocus range. However, the direction and magnitudeof change in CD will depend strongly on the resist threshold (dose toclear), the specific exposure dose, feature type, and proximity effects.Thus, exposure dose and through-focus CD changes are strongly coupled ina non-linear manner that prevents a direct, general parameterization ofCD or edge placement changes throughout the PW space.

However, the aerial image is also expected to show a continuousvariation through focus. Every mask point may be imaged to afinite-sized spot in the image plane that is characterized by the pointspread function of the projection system. This spot will assume aminimum size at best focus but will continuously blur into a widerdistribution with both positive and negative defocus. Therefore, it ispossible to approximate the variation of image intensities through focusas a second-order polynomial for each individual image point within theexposure field:I(x,ƒ)=I ₀(x)+a(x)·(ƒ−ƒ₀)+b(x)·(ƒ−ƒ₀)²  (Eq. 4)where ƒ₀ indicates the nominal or best focus position, and ƒ is theactual focus level at which the image I is calculated. The second-orderapproximation is expected to hold well for a sufficiently small defocusrange, but the accuracy of the approximation may easily be improved byincluding higher-order terms if required (for example, 3^(rd) orderand/or 4^(th) order terms). In fact, (Eq.4) can also be identified asthe beginning terms of a Taylor series expansion of the aerial imagearound the nominal best focus plane:

$\begin{matrix}{{I\left( {x,f} \right)} = {{I\left( {x,f_{0}} \right)} + {\frac{\partial{I\left( {x,f} \right)}}{\partial f}{{_{f = f_{0}}{{\cdot \left( {f - f_{0}} \right)} + {2\frac{\partial^{2}{I\left( {x,f} \right)}}{\partial f^{2}}}}}_{f = f_{0}} \cdot \left( {f - f_{0}} \right)^{2}}}}} & \left( {{Eq}.\mspace{14mu} 5} \right)\end{matrix}$which can in principle be extended to an arbitrarily sufficientrepresentation of the actual through-focus behavior of the aerial imageby extension to include additional higher-order terms. It is noted thatthe choice of polynomial base functions is only one possibility toexpress a series expansion of the aerial image through focus, and themethods of the current invention are by no means restricted to thisembodiment, e.g., the base functions can be special functions such asBessel Functions, Legendre Functions, Chebyshev Functions, Trigonometricfunctions, and so on. In addition, while the process window term is mostcommonly understood as spanning variations over defocus and exposuredose, the process window concept can be generalized and extended tocover additional or alternative parameter variations, such as variationof NA and sigma, etc.

Comparison of (Eq.4) and (Eq.5) reveals the physical meaning of theparameters “a” and “b” as first and second-order derivative images.These may in principle be determined directly as derivatives by a finitedifference method for every image point and entered into (Eq.4) and(Eq.5) to interpolate the image variations. Alternatively, in order toimprove the overall agreement between the interpolation and the actualthrough focus variation over a wider range, the parameters a and b canbe obtained from a least square fit of (Eq.4) over a number of focuspositions {ƒ₁, ƒ₂, . . . , ƒ_(L)} for which aerial images are explicitlycalculated as {I₁, I₂, . . . , I_(L)}. The parameters “a” and “b” arethen found as solutions to the following system of equations in a leastsquare sense (assuming here that L>3, in which case the system ofequations is over-determined).

Without loss of generality, it is assumed that ƒ₀=0 so as to simplifythe notation. Then for a fixed image point,

$\begin{matrix}{{I_{1} = {I_{0} + {a \cdot f_{1}} + {b \cdot f_{1}^{2}}}}{I_{2} = {I_{0} + {a \cdot f_{2}} + {b \cdot f_{2}^{2}}}}\ldots{I_{L} = {I_{0} + {a \cdot f_{L}} + {b \cdot f_{L}^{2}}}}} & \left( {{Eq}.\mspace{14mu} 6} \right)\end{matrix}$where I₀ is the aerial image at nominal conditions (NC), i.e. ƒ=ƒ₀. Thesolution to the above set of equations minimizes the following sum ofsquared differences, with the index l referring to the L different focusconditions:

$\begin{matrix}{G = {\underset{l = 1}{\sum\limits^{L}}{W_{l} \cdot \left( {I_{l} - I_{0} - {a \cdot f_{l}} - {b \cdot f_{l}^{2}}} \right)^{2}}}} & \left( {{Eq}.\mspace{14mu} 7} \right)\end{matrix}$where W_(l) is a user-assigned weight to defocus ƒ_(l) (l=1, 2, . . . ,L). Through {W₁, W₂, . . . , W_(L)}, it is possible to assign differentweights to different focuses. For example, in order to make the 2^(nd)order polynomial approximation have a better match at PW points closerto NC, it is possible to assign a larger weight close to NC and asmaller weight away from NC; or if it is desired for all focus points tohave equal importance, one can simply assign equal weights, i.e., W₁=W₂=. . . =W_(L)=1. For large deviations in focus and dose relative to thenominal condition, many patterns become unstable in printing and themeasurements of CDs become unreliable, in such cases it may be desirableto assign small weights to such process window conditions.

To solve (Eq.7), it is noted that the best fit will fulfill theconditions:

$\begin{matrix}{{\frac{\partial G}{\partial a} \equiv 0},{{{and}\mspace{14mu}\frac{\partial G}{\partial b}} \equiv 0}} & \left( {{Eq}.\mspace{14mu} 8} \right)\end{matrix}$(Eq.8) can be solved analytically, resulting in immediate expressionsfor “a” and “b” as the linear combination or weighted sum of the{I_(l)}, as shown below. The coefficients of this linear combination donot depend on the pixel coordinate or pattern, but only on the values ofthe {ƒ_(l)} and {W_(l)}. As such, these coefficients can be understoodas forming a linear filter for the purpose of interpolation in the spaceoff, and the particular choice of polynomials as base functions givesrise to the specific values of the coefficients, independent of the maskpattern. More specifically, the calculation of these coefficients isperformed once the values of {ƒ_(l)} and {W_(l)} are determined, withoutknowing the specific optical exposure settings or actually carrying outaerial image simulations.

With regard to solving (Eq.8), (Eq.7) can be rewritten as:

$\begin{matrix}{G = {\underset{l = 1}{\sum\limits^{L}}{W_{l} \cdot \left( {I_{l} - I_{0} - {a \cdot f_{l}} - {b \cdot f_{l}^{2}}} \right)^{2}}}} \\{= {\underset{l = 1}{\sum\limits^{L}}{W_{l} \cdot \left( {{b \cdot f_{l}^{2}} + {a \cdot f_{l}} - {\Delta\; I_{l}}} \right)^{2}}}}\end{matrix}$ where  Δ I_(l) = I_(l) − I₀   for  l = 1, 2, K, L.As a result, (Eq.8) can be expanded as:

$\begin{matrix}{\frac{\partial G}{\partial a} = {\sum\limits_{l = 1}^{L}\;{{W_{l} \cdot 2}{\left( {{b \cdot f_{l}^{2}} + {a \cdot f_{l}} - {\Delta\; I_{l}}} \right) \cdot f_{l}}}}} \\{= {{2\;{a \cdot {\sum\limits_{l = 1}^{L}\;{W_{l} \cdot f_{l}^{2}}}}} + {2\;{b \cdot {\sum\limits_{l = 1}^{L}\;{W_{l} \cdot f_{l}^{3}}}}} - {2 \cdot {\sum\limits_{l = 1}^{L}\;{{W_{l} \cdot \Delta}\;{I_{l} \cdot f_{l}}}}}}} \\{= {{2\;{a \cdot \alpha_{2}}} + {2\;{b \cdot \alpha_{3}}} - {2\;\Phi_{1}}}} \\{\equiv 0}\end{matrix}$ $\begin{matrix}{\frac{\partial G}{\partial b} = {\sum\limits_{l = 1}^{L}\;{{W_{l} \cdot 2}{\left( {{b \cdot f_{l}^{2}} + {a \cdot f_{l}} - {\Delta\; I_{l}}} \right) \cdot f_{l}^{2}}}}} \\{= {{2\;{a \cdot {\sum\limits_{l = 1}^{L}\;{W_{l} \cdot f_{l}^{3}}}}} + {2\;{b \cdot {\sum\limits_{l = 1}^{L}\;{W_{l} \cdot f_{l}^{4}}}}} - {2 \cdot {\sum\limits_{l = 1}^{L}\;{{W_{l} \cdot \Delta}\;{I_{l} \cdot f_{l}^{2}}}}}}} \\{= {{2\;{a \cdot \alpha_{3}}} + {2\;{b \cdot \alpha_{4}}} - {2\;\Phi_{2}}}} \\{\equiv 0}\end{matrix}$Thus.

$\begin{matrix}{{{a = {\frac{{\alpha_{4}\Phi_{1}} - {\alpha_{3}\Phi_{2}}}{{\alpha_{2}\alpha_{4}} - \alpha_{3}^{2}} = {{\sum\limits_{l = 1}^{L}\;{h_{al}\Delta\; I_{l}}} = {\sum\limits_{l = 1}^{L}\;{h_{al}\left( {I_{l} - I_{0}} \right)}}}}},{b = {\frac{{\alpha_{2}\Phi_{2}} - {\alpha_{3}\Phi_{1}}}{{\alpha_{2}\alpha_{4}} - \alpha_{3}^{2}} = {{\sum\limits_{l = 1}^{L}\;{h_{bl}\Delta\; I_{l}}} = {\sum\limits_{l = 1}^{L}\;{h_{bl}\left( {I_{l} - I_{0}} \right)}}}}}}{where}{{\alpha_{2} = {\sum\limits_{l = 1}^{L}\;{W_{l} \cdot f_{l}^{2}}}},{\alpha_{3} = {\sum\limits_{l = 1}^{L}\;{W_{l} \cdot f_{l}^{3}}}},{\alpha_{4} = {\sum\limits_{l = 1}^{L}\;{W_{l} \cdot f_{l}^{4}}}},{\Phi_{1} = {\sum\limits_{l = 1}^{L}\;{{W_{l} \cdot \Delta}\;{I_{l} \cdot f_{l}}}}},{\Phi_{2} = {\sum\limits_{l = 1}^{L}\;{{W_{l} \cdot \Delta}\;{I_{l} \cdot f_{l}^{2}}}}},{h_{al} = \frac{W_{l} \cdot f_{l} \cdot \left( {\alpha_{4} - {\alpha_{3} \cdot f_{l}}} \right)}{{\alpha_{2}\alpha_{4}} - \alpha_{3}^{2}}},{h_{bl} = \frac{W_{l} \cdot f_{l} \cdot \left( {{\alpha_{2} \cdot f_{l}} - \alpha_{3}} \right)}{{\alpha_{2}\alpha_{4}} - \alpha_{3}^{2}}}}} & \left( {{Eq}.\mspace{14mu} 9} \right)\end{matrix}$Note that:

$\begin{matrix}{{{\sum\limits_{l = 1}^{L}\;\left\lbrack {h_{al} \cdot f_{l}} \right\rbrack} = {\frac{{\alpha_{4} \cdot {\sum\limits_{l = 1}^{L}\;\left\lbrack {W_{l} \cdot f_{l}^{2}} \right\rbrack}} - {\alpha_{3} \cdot {\sum\limits_{l = 1}^{L}\;\left\lbrack {W_{l} \cdot f_{l}^{3}} \right\rbrack}}}{{\alpha_{2}\alpha_{4}} - \alpha_{3}^{2}} = {\frac{{\alpha_{4}\alpha_{2}} - \alpha_{3}^{2}}{{\alpha_{2}\alpha_{4}} - \alpha_{3}^{2}} = 1}}}{{\sum\limits_{l = 1}^{L}\;\left\lbrack {h_{al} \cdot f_{l}^{2}} \right\rbrack} = {\frac{{\alpha_{4} \cdot {\sum\limits_{l = 1}^{L}\;\left\lbrack {W_{l} \cdot f_{l}^{3}} \right\rbrack}} - {\alpha_{3} \cdot {\sum\limits_{l = 1}^{L}\;\left\lbrack {W_{l} \cdot f_{l}^{4}} \right\rbrack}}}{{\alpha_{2}\alpha_{4}} - \alpha_{3}^{2}} = {\frac{{\alpha_{4}\alpha_{3}} - {\alpha_{3}\alpha_{4}}}{{\alpha_{2}\alpha_{4}} - \alpha_{3}^{2}} = 0}}}{{\sum\limits_{l = 1}^{L}\;\left\lbrack {h_{bl} \cdot f_{l}} \right\rbrack} = {\frac{{\alpha_{2} \cdot {\sum\limits_{l = 1}^{L}\;\left\lbrack {W_{l} \cdot f_{l}^{3}} \right\rbrack}} - {\alpha_{3} \cdot {\sum\limits_{l = 1}^{L}\;\left\lbrack {W_{l} \cdot f_{l}^{2}} \right\rbrack}}}{{\alpha_{2}\alpha_{4}} - \alpha_{3}^{2}} = {\frac{{\alpha_{2}\alpha_{3}} - {\alpha_{3}\alpha_{2}}}{{\alpha_{2}\alpha_{4}} - \alpha_{3}^{2}} = 0}}}{{\sum\limits_{l = 1}^{L}\;\left\lbrack {h_{bl} \cdot f_{l}^{2}} \right\rbrack} = {\frac{{\alpha_{2} \cdot {\sum\limits_{l = 1}^{L}\;\left\lbrack {W_{l} \cdot f_{l}^{4}} \right\rbrack}} - {\alpha_{3} \cdot {\sum\limits_{l = 1}^{L}\;\left\lbrack {W_{l} \cdot f_{l}^{3}} \right\rbrack}}}{{\alpha_{2}\alpha_{4}} - \alpha_{3}^{2}} = {\frac{{\alpha_{2}\alpha_{4}} - \alpha_{3}^{2}}{{\alpha_{2}\alpha_{4}} - \alpha_{3}^{2}} = 1}}}} & \left( {{Eq}.\mspace{14mu} 10} \right)\end{matrix}$As is made clear below, this property will be useful in the resist modelsection. The above set of equations can be readily generalized toaccommodate a higher-order polynomial fitting.

The benefit of introducing the derivative images “a” and “b” is thatusing (Eq.4), the aerial image can be predicted at any point of theprocess window by straightforward scaling of the a and b images by thedefocus offset and a simple addition, rather than performing a fullimage simulation (i.e., convolution of the mask pattern with the TCCs)at each particular defocus setting required for a PW analysis. Inaddition, changes in exposure dose can be expressed by a simpleupscaling or downscaling of the image intensity by a factor (l+ε):I(x,ƒ,l+ε)=(1+ε)·I(x,ƒ)  (Eq. 11)where I(x,ƒ) is the aerial image at the nominal exposure dose, while εis the relative change in dose.Combining this with (Eq.4) yields the general result:

                                   (Eq.  12) $\begin{matrix}{{I\left( {x,f,{1 + ɛ}} \right)} = {\left( {1 + ɛ} \right) \cdot \left\lbrack {{I_{0}(x)} + {{a(x)} \cdot \left( {f - f_{0}} \right)} + {{b(x)} \cdot \left( {f - f_{0}} \right)^{2}}} \right\rbrack}} \\{= {{I_{0}(x)} + \left\lbrack {{ɛ \cdot {I_{0}(x)}} + {{\left( {1 + ɛ} \right) \cdot a}{(x) \cdot}}} \right.}} \\\left. {\left( {f - f_{0}} \right) + {\left( {1 + ɛ} \right) \cdot {b(x)} \cdot \left( {f - f_{0}} \right)^{2}}} \right\rbrack \\{= {{I_{0}(x)} + {\Delta\;{I(x)}}}}\end{matrix}$where ΔI will typically be small perturbations within a reasonable rangeof PW parameter variations.

The foregoing method is exemplified by a flow diagram in FIG. 3 wherethe contours, CD or Edge Placement Errors (EPEs) are to be extractedfrom the aerial image at different defocus conditions. Referring to FIG.3, the first step (Step 40) in the process is to identify the targetpattern or mask pattern to be simulated and the process conditions to beutilized. The next step (Step 42) is to generate a nominal image I_(O)and M defocus images {I_(l)} in accordance with (Eq.3) above.Thereafter, derivative images “a” and “b” are generated utilizing (Eq.9)(Step 43). The next step (Step 44) entails generating the defocus imageutilizing (Eq.4), i.e., the synthesis of I₀, a (scaled by ƒ) and b(scaled by ƒ²). Next, contours are extracted and CDs or feature EPEs aredetermined from the simulated image (Step 46). The process then proceedsto Step 48 to determine whether or not there is sufficient coverage(e.g., whether it is possible to determine the boundary of the processwindow) and if the answer is no, the process returns to Step 44 andrepeats the foregoing process. If there is sufficient coverage, theprocess is complete.

It is noted that if a sufficient coverage of the process window requiresevaluation at N process window points, and L<N images are used forfitting the derivative images a and b, the reduction in computation timewill be close to L/N, since scaling the predetermined images I₀, a and brequires significantly less computation time than an independentre-calculation of the projected image at each new parameter setting. Theforegoing method is generally applicable, independent of the specificdetails of the aerial image simulation. Furthermore, it is alsoapplicable to both the aerial image as well as to the resist image fromwhich simulated resist contours are extracted.

The foregoing method also does not depend on any specific model orimplementation used for simulating the set of aerial images {I₁, I₂ . .. , I_(L)} at varying defocus. However, the foregoing method requires anumber L>2 of individual images to be simulated for each mask layoutunder consideration. In a second embodiment of the method of the presentinvention, an even more efficient solution is made possible by the TCCformalism introduced in (Eq.1).

From (Eq.1), each aerial image at focus ƒ_(l) (1=0, 1, . . . , L) can bedefined as:I _(l)(x)=Σ_(k′)Σ_(k″)TCC_(l,k′,k″) M(k′)M*(k″)exp(−j(k′−k″)x)where TCC_(l) is the TCC at focus ƒ_(l) and TCC_(l,k′,k″) is the matrixelement of TCC_(l), and M(•) represents the mask image, which isindependent of the focus.

Combining this with (Eq.9) and exchanging the order of summation,

$\begin{matrix}{\mspace{689mu}{\left( {{Eq}.\mspace{14mu} 13} \right)\begin{matrix}{a = {\sum\limits_{l = 1}^{L}\;{h_{al}\left( {I_{l} - I_{0}} \right)}}} \\{= {\sum\limits_{l = 1}^{L}\;{h_{al}\left( {{\sum\limits_{k^{\prime}}{\sum\limits_{k^{''}}{{TCC}_{l,k^{\prime},k^{''}}{M\left( k^{\prime} \right)}{M^{*}\left( k^{''} \right)}{\exp\left( {{- {j\left( {k^{\prime} - k^{''}} \right)}}x} \right)}}}} -} \right.}}} \\{\sum\limits_{k^{\prime}}{\sum\limits_{k^{''}}{{TCC}_{0,k^{\prime},k^{''}}{M\left( k^{\prime} \right)}{M^{*}\left( k^{''} \right)}{\exp\left( {{- {j\left( {k^{\prime} - k^{''}} \right)}}x} \right)}}}} \\{= {\sum\limits_{k^{\prime}}{\sum\limits_{k^{''}}{\left\lbrack {\sum\limits_{l = 1}^{L}\;{h_{al}\left( {{TCC}_{l,k^{\prime},k^{''}} - {TCC}_{0,k^{\prime},k^{''}}} \right)}} \right\rbrack{M\left( k^{\prime} \right)}{M^{*}\left( k^{''} \right)}}}}} \\{\exp\left( {{- {j\left( {k^{\prime} - k^{''}} \right)}}x} \right)}\end{matrix}}} \\\begin{matrix}{b = {\sum\limits_{l = 1}^{L}\;{h_{bl}\left( {I_{l} - I_{0}} \right)}}} \\{= {\sum\limits_{l = 1}^{L}\;{h_{bl}\left( {{\sum\limits_{k^{\prime}}{\sum\limits_{k^{''}}{{TCC}_{l,k^{\prime},k^{''}}{M\left( k^{\prime} \right)}{M^{*}\left( k^{''} \right)}{\exp\left( {{- {j\left( {k^{\prime} - k^{''}} \right)}}x} \right)}}}} -} \right.}}} \\{\sum\limits_{k^{\prime}}{\sum\limits_{k^{''}}{{TCC}_{0,k^{\prime},k^{''}}{M\left( k^{\prime} \right)}{M^{*}\left( k^{''} \right)}{\exp\left( {{- {j\left( {k^{\prime} - k^{''}} \right)}}x} \right)}}}} \\{= {\sum\limits_{k^{\prime}}{\sum\limits_{k^{''}}{\left\lbrack {\sum\limits_{l = 1}^{L}\;{h_{bl}\left( {{TCC}_{l,k^{\prime},k^{''}} - {TCC}_{0,k^{\prime},k^{''}}} \right)}} \right\rbrack{M\left( k^{\prime} \right)}{M^{*}\left( k^{''} \right)}}}}} \\{\exp\left( {{- {j\left( {k^{\prime} - k^{''}} \right)}}x} \right)}\end{matrix}\end{matrix}$Thus if two new TCCs are defined as linear combinations of TCC_(l) (l=0,1, . . . , L) in the following way:

$\begin{matrix}{{A = {{\sum\limits_{l = 1}^{L}\;{h_{al}\Delta\;{TCC}_{l}}} = {\sum\limits_{l = 1}^{L}\;{h_{al}\left( {{TCC}_{l} - {TCC}_{0}} \right)}}}},{B = {{\sum\limits_{l = 1}^{L}\;{h_{bl}\Delta\;{TCC}_{l}}} = {\sum\limits_{l = 1}^{L}\;{h_{bl}\left( {{TCC}_{l} - {TCC}_{0}} \right)}}}}} & \left( {{Eq}.\mspace{14mu} 14} \right)\end{matrix}$then “a” and “b” are “aerial images” which can be computed directly fromA and B, i.e.,

$\begin{matrix}{{{a(x)} = {\sum\limits_{k^{\prime}}{\sum\limits_{k^{''}}{A_{k^{\prime},k^{''}}{M\left( k^{\prime} \right)}{M^{*}\left( k^{''} \right)}{\exp\left( {{- {j\left( {k^{\prime} - k^{''}} \right)}}x} \right)}}}}}{{b(x)} = {\sum\limits_{k^{\prime}}{\sum\limits_{k^{''}}{B_{k^{\prime},k^{''}}{M\left( k^{\prime} \right)}{M^{*}\left( k^{''} \right)}{\exp\left( {{- {j\left( {k^{\prime} - k^{''}} \right)}}x} \right)}}}}}} & \left( {{Eq}.\mspace{14mu} 15} \right)\end{matrix}$where

$A_{k^{\prime},k^{''}} = {\sum\limits_{l = 1}^{L}\;{{h_{al}\left( {{TCC}_{l,k^{\prime},k^{''}} - {TCC}_{0,k^{\prime},k^{''}}} \right)}\mspace{14mu}{and}}}$$B_{k^{\prime},k^{''}} = {\sum\limits_{l = 1}^{L}\;{h_{bl}\left( {{TCC}_{l,k^{\prime},k^{''}} - {TCC}_{0,k^{\prime},k^{''}}} \right)}}$are the matrix elements of A and B, respectively.

This also implies that a linear combination of aerial images ofdifferent planes can be computed using a single linear combination ofTCCs corresponding to those planes.

A significant advantage of using TCC₀, A, and B in place of the Lthrough-focus images is that the TCC₀, A, and B can be pre-computed,independently of the actual mask pattern, for known illumination andprojection parameters, giving rise to the possibility of furtherreduction of computing time (down from L through-focus simulations foreach mask pattern), which will be further explained below. It is notedthat the generation of A and B neither requires computation of a set ofaerial images at different defocus conditions nor requires calibrationfrom this set of aerial images. Once TCC₀, A, and B have beencalculated, these terms can be generally applied to predict thethrough-focus imaging performance for any specific mask design using(Eq.15) and (Eq.4). Besides the through-focus variation, a variation ofexposure dose around nominal condition can be applied to the TCC termsby the same linear scaling as described by (Eq.11) and (Eq.12) above.

Calculating the derivative images a and b from TCCs A and B allows afurther reduction of computation time by using only the dominant termsof A and B, as in the discussions related to (Eq.2). More specifically,suppose the diagonalization of TCC₀, A and B is:

$\begin{matrix}{{{TCC}_{0} = {\sum\limits_{i = 1}^{N_{0}}\;{\lambda_{0,i}{\phi_{0,i}\left( k^{\prime} \right)}{\phi_{0,i}\left( k^{''} \right)}}}}{A = {\sum\limits_{i = 1}^{N_{A}}\;{\lambda_{A,i}{\phi_{A,i}\left( k^{\prime} \right)}{\phi_{A,i}\left( k^{''} \right)}}}}{B = {\sum\limits_{i = 1}^{N_{B}}\;{\lambda_{B,i}{\phi_{B,i}\left( k^{\prime} \right)}{\phi_{B,i}\left( k^{''} \right)}}}}} & \left( {{Eq}.\mspace{14mu} 16} \right)\end{matrix}$where λ_(0,i) (i=1, . . . , N₀) denotes the N₀ largest eigenvalues andϕ_(0,i)(•) denotes the corresponding eigenvector of the TCC matrix TCC₀;λ_(A,i) (i=1, . . . , N_(A)) denotes the N_(A) largest eigenvalues andϕ_(A,i)(•) denotes the corresponding eigenvector of the TCC matrix A;and λ_(B,i) (i=1, . . . , N_(B)) denotes the N_(B) largest eigenvaluesand ϕ_(B,i)(•) denotes the corresponding eigenvector of the TCC matrixB.

Then, from (Eq.3), for mask image M(•),

$\begin{matrix}{{{I_{0}(x)} = {\sum\limits_{i = 1}^{N_{0}}\;{\lambda_{0,i}{{\Phi_{0,i}(x)}}^{2}}}}{{a(x)} = {\sum\limits_{i = 1}^{N_{A}}\;{\lambda_{A,i}{{\Phi_{A,i}(x)}}^{2}}}}{{b(x)} = {\sum\limits_{i = 1}^{N_{B}}\;{\lambda_{B,i}{{\Phi_{B,i}(x)}}^{2}}}}} & \left( {{Eq}.\mspace{14mu} 17} \right)\end{matrix}$where I₀ is the nominal aerial image

${{\Phi_{0,i}(x)} = {\sum\limits_{k^{''}}{{\phi_{0,i}\left( k^{''} \right)}{M\left( k^{''} \right)}{\exp\left( {{- {jk}^{''}}x} \right)}}}},{{\Phi_{A,i}(x)} = {\sum\limits_{k^{''}}{{\phi_{A,i}\left( k^{''} \right)}{M\left( k^{''} \right)}{\exp\left( {{- {jk}^{''}}x} \right)}\mspace{14mu}{and}}}}$${\Phi_{B,i}(x)} = {\sum\limits_{k^{''}}{{\phi_{B,i}\left( k^{''} \right)}{M\left( k^{''} \right)}{{\exp\left( {{- {jk}^{''}}x} \right)}.}}}$

Utilizing a larger number of TCC terms generally improves the accuracyof the optical model and the separability of optical and resist modelcomponents. However, since the image or TCC derivatives relate torelatively minor image variations within the PW, typically on the orderof 10% in CD variation, a smaller number of terms may suffice for the Aand B terms than for the Nominal Condition TCC₀. For example, if 64terms are considered for TCC₀, (i.e., N₀=64), only 32 terms aretypically required for each of the A and B terms in order to achievesufficient CD prediction accuracy, i.e., N_(A)=N_(B)=32. In this case,approximately the same amount of computation time will be required togenerate the derivative images a and b as compared to the nominalcondition I₀. It is noted that, unlike the original TCC matrices, acoefficient TCC matrix such as A or B is in general notnon-negative-definite, which implies both positive and negativeeigenvalues exist for a derivative TCC matrix. Therefore, the leadingterms from the eigen-series expansion and truncation should include alleigenvalues with the largest absolute values, both positive andnegative.

Similar to (Eq.5), (Eq.14) can be derived alternatively from seriesexpansion. More specifically, the variation of TCC matrix elementsaround nominal or best focus ƒ₀ may also be expressed as a seriesexpansion:

$\begin{matrix}{{{TCC}_{k^{\prime},k^{''}}(f)} = \left. {{{TCC}_{k^{\prime},k^{''}}\left( f_{0} \right)} + \frac{\partial{{TCC}_{k^{\prime},k^{''}}(f)}}{\partial f}} \middle| {}_{f = f_{0}}{{\cdot \left( {f - f_{0}} \right)} + {2\frac{\partial^{2}{{TCC}_{k^{\prime},k^{''}}(f)}}{\partial f^{2}}}} \middle| {}_{f = f_{0}}{\cdot \left( {f - f_{0}} \right)^{2}} \right.} & \left( {{Eq}.\mspace{14mu} 18} \right)\end{matrix}$

Thus, the coefficients of the series expansion may be evaluated directlyby a numerical finite difference method, or again from a least-squarefitting to a number of individually calculated TCC terms correspondingto a set of focus positions, in a manner similar to the through-focusfitting of aerial images discussed in the previous section. The fittingapproach provides a wider range of validity, and introduces weightfactors to place more or less emphasis on certain parts of the PW. Thisapproach will follow (Eq.6)-(Eq.9) after replacing the set of testimages I_(l) by their corresponding TCCs in the equations. Consequently,the best fit derivative matrices A and B are obtained from the samelinear combination set forth above, also after formally replacing theI_(l) by TCC_(l), i.e.,

$\begin{matrix}{{A = {{\sum\limits_{l = 1}^{L}\;{h_{al}\Delta\;{TCC}_{l}}} = {\sum\limits_{l = 1}^{L}\;{h_{al}\left( {{TCC}_{l} - {TCC}_{0}} \right)}}}},{B = {{\sum\limits_{l = 1}^{L}\;{h_{bl}\Delta\;{TCC}_{l}}} = {\sum\limits_{l = 1}^{L}\;{h_{bl}\left( {{TCC}_{l} - {TCC}_{0}} \right)}}}}} & \left( {{Eq}.\mspace{14mu} 19} \right)\end{matrix}$where h_(al) and h_(bl) are again computed using (Eq.9). It is notedthat h_(al) and h_(bl) are constants that do not depend on the patternsor TCC_(l). Thus, A and B are simply a linear combination of the NominalCondition TCC₀ and a set of TCC's at various defocus conditions (TCC_(l)through TCC_(L)).

Note that (Eq.19) is the same as (Eq.14), thus these two alternativeapproaches lead to the same final formulation. Similarly, (Eq.4) canalso be derived from (Eq.15), (Eq.18), and (Eq.19).

The method of the second embodiment is exemplified by the flow diagramin FIG. 4 where the contours, CD or Edge Placement Errors (EPEs) are tobe extracted from the aerial image at different defocus conditions. Thefirst step (Step 50) in the process is to identify the process specificoptical conditions associated with the desired process. The next step(Step 52) is to generate a nominal condition TCC_(O) and L defocus{TCC_(l)}. Thereafter, derivative TCCs: A and B are generated utilizing(Eq.14) (Step 54). The next step (Step 58) generates images I₀, a, b byconvolution of the mask image with TCC₀, A and B utilizing (Eq.17).Next, for each mask design (Step 56), defocus image is synthesizedutilizing (Eq.4) (Step 60), thereby generating the simulated image.Next, contours are extracted and CDs or feature EPEs are determined fromthe simulated image (Step 62). The process then proceeds to Step 64 todetermine whether or not there is sufficient coverage to determine theboundary of the process window and if the answer is no, the processreturns to Step 58 and repeats the foregoing process. If there issufficient coverage, the process proceeds to Step 66 to determine if theimage produced by the mask design is within allowable error tolerances,and if so, the process is complete. If not, the process returns to Step56 so as to allow for adjustment and redesign of the mask. It is notedthat this last step is an optional step in the process.

In the flowchart of FIG. 4, the diagram shows PW analysis embeddedwithin a ‘mask variation loop’ which may be required, in particular, foriterative, PW-aware OPC modifications of an initial mask design. In thissituation, any improvement in computation speed for the through-PW imageassessment will be especially beneficial.

An additional reduction in computation time may be achieved by furthersuitable assumptions or a priori knowledge about the physics of theoptical system. For example, in the absence of strong aberrations, itcan be expected that the through-focus variation of aerial imageintensities will be an even (i.e. symmetrical) function of defocus.Therefore, it can be expected that the first-order derivatives “A” and“a” will be negligible under these conditions.

This simplification can be further justified by noting that the effectof defocus corresponds to a multiplication of the pupil function by aphase factor p=p₀ exp[ja(ƒ−ƒ₀)²], where the nominal focus is at ƒ₀=0.For small defocus the phase shift can be approximated by a Taylorexpansion: p=p₀·[1+ja(ƒ−ƒ₀)²], which does not contain a linear term.

All the above methods may also be extended to a generalized processwindow definition that can be established by different or additionalbase parameters in addition to exposure dose and defocus. These mayinclude, but are not limited to, optical settings such as NA, sigma,aberrations, polarization, or optical constants of the resist layer(whose effects on the imaging process are included in the optical model,i.e. the TCCs). As one example, including a variation of NA aroundnominal conditions, the aerial image can be expressed as:I(ƒ,NA)=I ₀ +a·(ƒ−ƒ₀)+b·(ƒ−ƒ₀)²+c·(NA−NA₀)+d·(NA−NA₀)²+ε·(ƒ−ƒ₀)·(NA−NA₀)  (Eq. 20)where I, I₀, a, . . . , e are 2-dimensional images and imagederivatives, respectively. The additional parameters “c”, “d”, and “e”can be determined by a least square fit to a set of simulated images ora set of simulated TCCs at varying parameter values for ƒ and NA, whilethe scaling with exposure dose as in (Eq.11) and (Eq.12) still applies.It is noted that, similar to (Eq.9), these parameters (a, b, c, d, andthe cross-term coefficient e) are again a linear combination of aerialimages {I_(l)}. The coefficients of this linear combination do notdepend on the pixel coordinate or pattern, but only on the values of the{ƒ_(l)}, {NA_(l)}, and/or the user-assigned weights {W_(l)}.

For this generalized PW model, simplifications based on physical insightare also possible. In case of NA variations, for example, it can beexpected that these will have a rather monotonous, linear effect on theimage variations, in which case (Eq.20) can be simplified by droppingthe higher order “d” and “e” terms in NA, possibly in addition to thelinear term in defocus. Also, for any generalized PW definition, thenumber of TCC terms used for calculating I₀ at Nominal Condition neednot be the same as the number of terms used for calculating imagevariations from the TCC derivatives A, B, . . . . A sufficientlyaccurate description of minor image variations due to small parametervariations around Nominal Condition may be achieved with a large numberof terms for I₀ and a significantly smaller number for the derivatives,in order to reduce the overall computation time.

For simplicity purposes, the following discussion will be based ondefocus and exposure dose. However, it should be noted that all thedisclosures herein can be extended to generalized PW with otherparameters such as NA, sigma, aberrations, polarization, or opticalconstants of the resist layer, as illustrated in (Eq.20).

In the embodiments set forth above, analytic expressions for the aerialimage in the vicinity of best focus for a range of PW parameters weredeveloped. The following descriptions derive similar expressions andmethods to calculate the resist image, which forms the basis forextraction of simulated resist contours across the PW.

Separable, Linear Resist Model

Although the response of photo resist to illumination by the projectedaerial image may be strongly nonlinear, having a thresholding behavior,many processes occurring in the resist layer, such as diffusion duringPEB, can be modeled by convoluting the aerial image with one or morelinear filters before applying the threshold. Such models will begenerally referred to as ‘linear’ resist models, and the latent resistimage for such models may be expressed schematically as:R(x)=P{I(x)}+R _(b)(x)  (Eq. 21)here, P{ } denotes the functional action of applying a linear filter(i.e. generally a convolution), while R_(b) is a mask loading bias thatis independent of the aerial image. The resist threshold is understoodto be included in R_(b) such that resist contours correspond tolocations where R(x)=₀.

Applying this model to the general, scaled, interpolated aerial imagederived above, i.e., (Eq.12, assuming ƒ₀=0 without loss of generality),results in

$\begin{matrix}\begin{matrix}{R = {\left\lbrack {{P\left\{ I_{0} \right\}} + R_{b}} \right\rbrack + {{ɛ \cdot P}\left\{ I_{0} \right\}} + {{\left( {1 + ɛ} \right) \cdot f \cdot P}\left\{ a \right\}} + {{\left( {1 + ɛ} \right) \cdot f^{2} \cdot P}\left\{ b \right\}}}} \\{= {R_{0} + {{ɛ \cdot P}\left\{ I_{0} \right\}} + {{\left( {1 + ɛ} \right) \cdot f \cdot P}\left\{ a \right\}} + {{\left( {1 + ɛ} \right) \cdot f^{2} \cdot P}\left\{ b \right\}}}} \\{= {R_{0} + {\Delta\;{R\left( {x,ɛ,f} \right)}}}}\end{matrix} & \left( {{Eq}.\mspace{14mu} 22} \right)\end{matrix}$where R₀ is the resist image at Nominal Condition (NC). All correctionsdue to changes in exposure dose and focus (or, other PW parameters) arederived by applying the same filter to the derivative images a, b as tothe image I₀ at NC, and simple scaling and summation of the correctionterms.

Moreover, the effect of a linear filter may be included in the imagingTCC formalism, since the convolution with a filter in the space domainis equivalent to a multiplication with the filter's Fourier seriescomponents in the frequency domain. Starting from an aerial imageexpression (Eq.1):I(x)=Σ_(k′)Σ_(k″)TCC_(k′,k″) M(k′)M*(k″)exp(−j(k′−k″)x)It is shown that the TCC matrix element at k′, k″ contributes to the(k′−k″) frequency component of I(x) by the amountTCC_(k′,k″)M(k′)M*(k″). Therefore, a resist image defined by:I(x)

g(x)where g(x) is a spatial filter with the Fourier transform being G(k),can be expressed as:

$\begin{matrix}{{{I(x)} \otimes {g(x)}} = {\sum\limits_{k^{\prime}}{\sum\limits_{k^{''}}{{TCC}_{k^{\prime},k^{''}}{M\left( k^{\prime} \right)}{M^{*}\left( k^{''} \right)}{\exp\left( {{- {j\left( {k^{\prime} - k^{''}} \right)}}x} \right)}{G\left( {k^{\prime} - k^{''}} \right)}}}}} \\{= {\sum\limits_{k^{\prime}}{\sum\limits_{k^{''}}{{TCC}_{k^{\prime},k^{''}}^{new}{M\left( k^{\prime} \right)}{M^{*}\left( k^{''} \right)}{\exp\left( {{- {j\left( {k^{\prime} - k^{''}} \right)}}x} \right)}}}}}\end{matrix}$with a new TCC matrix defined asTCC^(new) _(k′,k″)=TCC_(k′,k″) G(k′−k″)

With this procedure, the linear filter is incorporated into thebi-linear TCC matrix, so all the computational procedures applicable toa purely optical aerial image may be applied to the linearly filteredaerial image. This property allows a significant reduction in overallcomputation time, since the complete resist image can be generated by asingle evaluation of (Eq.1), with the only modification of adding weightfactors corresponding to the Fourier coefficients of the filter P. Forany given mask design input, this formulation would allow to generatedirectly, in one pass each, the images P{I₀}, P{a}, P{b} from thepre-computed, filter-adjusted TCC₀, A, and B matrices. (Eq.22) thendefines the actual resist image for any arbitrary PW point as a linearcombination of these three images.

Non-Separable, Linear Resist Model

In the preceding discussion, it was implicitly assumed that allparameters of the linear filters establishing the resist model areconstant across the variations of the process window parameters. Thisequates to one condition for an overall separable lithography model:resist model parameters are independent of optical model parameters. Apragmatic test for separability is the ability to accurately calibratethe model and fit test data across the complete extent of the PW. Inpractice, the semi-empirical nature of models suitable for full-chiplithography simulation may preclude perfect separability and may requireresist model parameters that are allowed to vary with PW parameters suchas defocus, NA or sigma settings. For a physically motivated model, itshould be expected (or required as a constraint), though that the modelparameters vary smoothly under variation of the PW variables. In thiscase, the series expansion of the resist image may include derivativeterms of the resist model parameters.

For illustration purposes, consider defocus as the only PW parameter. Ifthe linear resist model is equivalent to a convolution with a linearfilter, (or a multitude of linear filters), a separable model may bedescribed by:R(x,ƒ)=P(x)

I(x,ƒ)+R _(b)(x)  (Eq. 23)while a non-separable model may require an explicit ƒ-dependence of thefilterR(x,ƒ)=P(x,ƒ)

I(x,ƒ)+R _(b)(x)  (Eq. 24)

Now, considering through-focus changes, a pro-forma series expansion maybe applied to (Eq.24), for illustration herein only up to first order:

$\begin{matrix}{\mspace{610mu}{\left( {{Eq}.\mspace{14mu} 25} \right)\begin{matrix}{{R\left( {x,f} \right)} = {{R\left( {x,f_{0}} \right)} + {\left\lbrack {{{a_{P}(x)} \otimes {I_{0}(x)}} + {{P\left( {x,f_{0}} \right)} \otimes {a(x)}}} \right\rbrack \cdot \left( {f - f_{0}} \right)} + \ldots}} \\{= {{R_{0}(x)} + {\Delta\;{R\left( {x,f} \right)}}}}\end{matrix}}} \\{{where}\mspace{605mu}\left( {{Eq}.\mspace{14mu} 26} \right){{a_{P}(x)} = \left. \frac{\partial{P\left( {x,f} \right)}}{\partial f} \right|_{f = f_{0}}}}\end{matrix}$

If the resist model parameters are found to vary continuously across thePW space, similar series expansion and fitting as introduced above forthe AI and TCCs can be applied to the resist model parameters duringmodel calibration. In this case a linear, derivative filter a_(p) can becalculated and be used in (Eq.25), which may also be extended in astraightforward way to include higher-order terms. In this situation,resist model parameters as well as aerial image variations are smoothlyinterpolated across the complete PW area. Both P and a_(p) can bedetermined in a through-PW model calibration step based on experimentalwafer data from test or gauge patterns.

However, even if resist model parameters appear to vary non-monotonouslyacross the PW, any piece-wise interpolation in between calibrationpoints could provide ‘best-guess’ resist model parameters for arbitraryPW points.

General Resist Model

For a general resist model that may include nonlinear operations such astruncations of the aerial or resist image, the straightforwardseparation into nominal condition and derivative terms, as shown in(Eq.22) will be no longer valid. However, there are three alternativemethods to deal with the non-linear operations.

i) Associated Linear Filter

First, it is assumed that the general variation of the resist imagethrough PW can be approximated formally by the second line in (Eq.22),with the reinterpretation that the linear filter P{ } will no longercorrectly describe the resist model at NC (Normal Condition). Instead,linear filter P{ } will be chosen to reproduce the best representationof differential resist image changes relative to the NC. While anonlinear model may ensure the most accurate model fitting at the NC, itmay require significantly more computation time than a linear model. Byrelying on such an associated linear filter to emulate the differentialthrough-PW behavior, only a single evaluation of the nonlinear modelwill be required to generate R₀(x), while PW analysis at a multitude ofPW conditions can be based on more efficient evaluation of P{I₀}, P{a},P{b}.

The coefficients of the nominal condition resist model as well as of theassociated filter may be determined from a unified model calibrationprocedure based on calibration test patterns and wafer gauge datacovering pattern variations and process window variations, as anextension of the method described in U.S. Pat. No. 7,587,704.

Further, once a valid unified PW model (FEM) has been generated andcalibrated in the manner set forth in U.S. Pat. No. 7,587,704, it willprovide the best prediction of through-PW changes of the resist image.The parameters of the optimum associated filter may then be determinedby minimizing the overall (RMS (root mean square)) difference betweenthe simplified model using the associated filter and the complete,calibrated model, without any need for additional experimentalcalibration data.

Using the full model, for any suitable number and range of teststructures, including e.g. 1-D (line/space) and 2-D (line ends etc)patterns, ‘correct’ resist images and contours can be simulated for anynumber of PW points. In addition, the values of the derivative images aand b can be calculated in the vicinity of the resist contours. For eachpattern, the change of R(x) through-PW will be calculated atpattern-specific gauge points, e.g. the tip of a line for a line-endtest pattern, or along any point of the NC resist contour. At each ofthese evaluation points x, throughΔR(x _(i),ε,ƒ)=R(x _(i),ε,ƒ)−R(x _(i),ε=0,ƒ=ƒ₀)=R(x _(i),ε,ƒ)  (Eq. 27)since x_(i) is assumed to be on a resist contour, whereR(x_(i),ε=0,ƒ=ƒ₀)=0. ΔR(x_(i),ε,ƒ) should be well approximated byΔR _(a)(x _(i))=ε·P{I ₀(x _(i))}+(1+ε)·ƒ·P{a(x _(i))}+(1+ε)·ƒ² ·P{b(x_(i))}  (Eq. 28)

Therefore, the optimal associated filter will minimize the sum ofsquared differences between (Eq.27) and (Eq.28), and can be determinedby a variety of known optimization algorithms. It is noted thatevaluation of (Eq.27) and (Eq.28) during the associated filter fittingshould be performed at resist contours, so that the resulting filtermost closely reproduces changes close to edge positions. Performance ofthe associated filter—in terms of accurately predicting changes in theresist image level— far away from edge positions is generally notrequired. After this fitting routine, the full-PW behavior of the resistimages is again described asR(x,ε,ƒ)=R ₀(x)+ΔR _(a)(x,ε,ƒ)  (Eq. 29)where the filtered differential images can be efficiently calculatedwithin the TCC formalism, the ΔR constitutes relatively smallperturbations, and the resist images at any arbitrary PW point can bepredicted from a simple linear combination of the four images R₀, P{I₀},P{a}, and P{b}.ii) Embedded Linearization

The above approach presents a linearized filter (i.e., the associatedfilter) which is optimal in that it is the single linear filter whichminimizes the (RMS) difference for all pattern-specific gauge points oralong any point of the NC (Nominal Condition) resist contour. Next, analternative approach is discussed which incorporates resist modellinearization in the computation of derivative resist images.

More specifically, after obtaining a and b in (Eq.2), the goal becomesidentifying R₀, Ra and Rb such that their linear combination (assumingthat ƒ₀=0 without loss of generality)R _(EL)(x,ƒ)=R ₀(x)+Ra(x)·ƒ+Rb(x)·ƒ²  (Eq. 30)is the best fit for

$\begin{matrix}\begin{matrix}{{R\left( {x,f_{l}} \right)} = {R\left\{ {I\left( {x,f_{l}} \right)} \right\}}} \\{= {R\left\{ {{I_{0}(x)} + {{a(x)} \cdot f_{l}} + {{b(x)} \cdot f_{l}^{2}}} \right\}}}\end{matrix} & \left( {{Eq}.\mspace{14mu} 31} \right)\end{matrix}$over a number of focus positions ƒ_(l)={ƒ₁, ƒ₂, . . . , ƒ_(L)} withpossibly a set of weights {W₁, W₂, . . . , W_(L)}, where R₀ is theresist image at NC. (Eq.31) is essentially applying the resist modelR{⋅} to the aerial image expressed in (Eq.2). It is noted that theresist model R{⋅} may be non-linear, thus Ra and Rb are not necessarilyP{a} and P{b} or R{a} and R{b}.

As such:

$\begin{matrix}{{{R_{0}(x)} = {R\left( {I_{0}(x)} \right)}}{{{Ra}(x)} = {\sum\limits_{l = 1}^{L}\;{h_{al}\left\lbrack {{R\left( {x,f_{l}} \right)} - {R_{0}(x)}} \right\rbrack}}}{{{Rb}(x)} = {\sum\limits_{l = 1}^{L}\;{h_{bl}\left\lbrack {{R\left( {x,f_{l}} \right)} - {R_{0}(x)}} \right\rbrack}}}} & \left( {{Eq}.\mspace{14mu} 32} \right)\end{matrix}$where h_(al) and h_(bl) are coefficients defined in (Eq.9). Thecoefficients only depend on {ƒ₁, ƒ₂, . . . ƒ_(L))} and possibly weights{W₁, W₂, . . . , W_(L)}, and they are independent of R(x,ƒ_(l)) orI(x,ƒ_(l)).

In general, the resist model R{⋅} can be separated as:R{I(x)}=P{I(x)}+P _(NL) {I(x)}+R _(b)  (Eq. 33)where Rb is a mask loading bias that is independent of the aerial imageI(x) or focus, P{ } is the linear filter operation and P_(NL){ } is somenon-linear operation.

Combining (Eq.32) and (Eq.33),

$\begin{matrix}\begin{matrix}{{{Ra}(x)} = {\sum\limits_{l = 1}^{L}\;{h_{al}\left\lbrack {{R\left( {x,f_{l}} \right)} - {R_{0}(x)}} \right\rbrack}}} \\{= {{\sum\limits_{l = 1}^{L}\;{h_{al}\left\lbrack {{P\left\{ {I\left( {x,f_{l}} \right)} \right\}} - {P\left\{ {I_{0}(x)} \right\}}} \right\rbrack}} +}} \\{\sum\limits_{l = 1}^{L}\;{h_{al}\left\lbrack {{P_{NL}\left\{ {I\left( {x,f_{l}} \right)} \right\}} - {P_{NL}\left\{ {I_{0}(x)} \right\}}} \right\rbrack}} \\{{{Rb}(x)} = {\sum\limits_{l = 1}^{L}\;{h_{bl}\left\lbrack {{R\left( {x,f_{l}} \right)} - {R_{0}(x)}} \right\rbrack}}} \\{= {{\sum\limits_{l = 1}^{L}\;{h_{bl}\left\lbrack {{P\left\{ {I\left( {x,f_{l}} \right)} \right\}} - {P\left\{ {I_{0}(x)} \right\}}} \right\rbrack}} +}} \\{\sum\limits_{l = 1}^{L}\;{h_{bl}\left\lbrack {{P_{NL}\left\{ {I\left( {x,f_{l}} \right)} \right\}} - {P_{NL}\left\{ {I_{0}(x)} \right\}}} \right\rbrack}}\end{matrix} & \left( {{Eq}.\mspace{14mu} 34} \right)\end{matrix}$

As discussed previously, since P{ } is a linear operation, then

$\begin{matrix}\begin{matrix}{{P\left\{ {I\left( {x,f_{l}} \right)} \right\}} = {P\left\{ {{I_{0}(x)} + {{a(x)} \cdot f_{l}} + {{b(x)} \cdot f_{l}^{2}}} \right\}}} \\{= {{P\left\{ {I_{0}(x)} \right\}} + {P{\left\{ {a(x)} \right\} \cdot f_{l}}} + {P{\left\{ {b(x)} \right\} \cdot f_{l}^{2}}}}}\end{matrix} & \left( {{Eq}.\mspace{14mu} 35} \right)\end{matrix}$

As expected, it is possible to derive the following result with the aidof (Eq.9) and (Eq.10) set forth above,

$\begin{matrix}{{{\sum\limits_{l = 1}^{L}\;{h_{al}\left\lbrack {{P\left\{ {I\left( {x,f_{l}} \right)} \right\}} - {P\left\{ {I_{0}(x)} \right\}}} \right\rbrack}} = {{\sum\limits_{l = 1}^{L}\;{h_{al}\left\lbrack {{P{\left\{ {a(x)} \right\} \cdot f_{l}}} + {P{\left\{ {b(x)} \right\} \cdot f_{l}^{2}}}} \right\rbrack}} = {{{P{\left\{ {a(x)} \right\} \cdot {\sum\limits_{l = 1}^{L}\;\left\lbrack {h_{al} \cdot f_{l}} \right\rbrack}}} + {P{\left\{ {b(x)} \right\} \cdot {\sum\limits_{l = 1}^{L}\;\left\lbrack {h_{al} \cdot f_{l}^{2}} \right\rbrack}}}} = {P\left\{ {a(x)} \right\}}}}}{\sum\limits_{l = 1}^{L}\;{h_{bl}\left\lbrack {{P\left\{ {I\left( {x,f_{l}} \right)} \right\}} - {P\left\{ {I_{0}(x)} \right\}}} \right\rbrack}} = {{\sum\limits_{l = 1}^{L}\;{h_{bl}\left\lbrack {{P{\left\{ {a(x)} \right\} \cdot f_{l}}} + {P{\left\{ {b(x)} \right\} \cdot f_{l}^{2}}}} \right\rbrack}} = {{{P{\left\{ {a(x)} \right\} \cdot {\sum\limits_{l = 1}^{L}\;\left\lbrack {h_{bl} \cdot f_{l}} \right\rbrack}}} + {P{\left\{ {b(x)} \right\} \cdot {\sum\limits_{l = 1}^{L}\;\left\lbrack {h_{bl} \cdot f_{l}^{2}} \right\rbrack}}}} = {P\left\{ {b(x)} \right\}}}}} & \left( {{Eq}.\mspace{14mu} 36} \right)\end{matrix}$Thus, Ra and Rb can computed from

$\begin{matrix}{{{{Ra}(x)} = {{P\left\{ {a(x)} \right\}} + {\sum\limits_{l = 1}^{L}\;{h_{al}\left\lbrack {{P_{NL}\left\{ {I\left( {x,f_{l}} \right)} \right\}} - {P_{NL}\left\{ {I_{0}(x)} \right\}}} \right\rbrack}}}}{{{Rb}(x)} = {{P\left\{ {b(x)} \right\}} + {\sum\limits_{l = 1}^{L}\;{h_{bl}\left\lbrack {{P_{NL}\left\{ {I\left( {x,f_{l}} \right)} \right\}} - {P_{NL}\left\{ {I_{0}(x)} \right\}}} \right\rbrack}}}}} & \left( {{Eq}.\mspace{14mu} 37} \right)\end{matrix}$

The benefits of this approach are that it does not attempt to capturethe differential through-PW behavior for all gauge points using a singlelinear filter. Rather, this approach minimizes the (RMS) difference foreach pixel, thereby improving the overall accuracy. In addition, thisapproach does not require identification of pattern-specific gaugepoints or all NC resist contour neighboring points. One drawback is thatthis approach slightly increases the computation complexity for Ra andRb. However, since the synthesis of through-PW resist images onlyrequire scaling and additions of R₀, Ra and Rb, the increase in thecomputation complexity of the derivative images is generallyinsignificant compared to the reduction in computation complexity ofthrough-PW resist images, especially for dense PW sampling.

iii) Polynomial Approximation of Non-Linear Operations

In a third approach, non-linear resist model operations are approximatedusing polynomials. More specifically, for truncation operations on imageI(x), for the purpose of emulating acid and base reaction effects,quadratic polynomials of the image provide a sufficient approximation.Another typical non-linear operation, the linear filtering of the imageslope, can be expressed precisely as the linear filtering of a quadraticfunction of the image gradient G{I(x)}=I(x)−I(x−1), thus the quadraticpolynomial of the aerial image I(x) itself. More specifically, lettingG{ } be the gradient operation and the linear filter be P_(Slope){•},then this non-linear operation can be expressed as:P _(Slope) {G{I(x)}}=P _(Slope){(I(x)−I(x−1))²}  (Eq. 38)

To summarize, the resist image from aerial image I(x) can beapproximated as:

$\begin{matrix}\begin{matrix}{{R\left\{ {I\left( {x,f} \right)} \right\}} = {{P_{1}\left\{ {I\left( {x,f} \right)} \right\}} + {P_{2}\left\{ {I^{2}\left( {x,f} \right)} \right\}} + {R_{b}(x)} +}} \\{P_{Slope}\left\{ \left( {{I\left( {x,f} \right)} - {I\left( {{x - 1},f} \right)}} \right)^{2} \right\}} \\{= {{P_{1}\left\{ {{I_{0}(x)} + {{a(x)} \cdot f} + {{b(x)} \cdot f^{2}}} \right\}} +}} \\{{P_{2}\left\{ \left( {{I_{0}(x)} + {{a(x)} \cdot f} + {{b(x)} \cdot f^{2}}} \right)^{2} \right\}} + {R_{b}(x)} +} \\{P_{Slope}\left\{ \left( {{I_{0}(x)} + {{a(x)} \cdot f} + {{b(x)} \cdot f^{2}} - {I_{0}\left( {x - 1} \right)} -} \right. \right.} \\\left. \left. {{{a\left( {x - 1} \right)} \cdot f} - {{b\left( {x - 1} \right)} \cdot f^{2}}} \right)^{2} \right\} \\{= {{P_{1}\left\{ {I_{0}(x)} \right\}} + {P_{1}{\left\{ {a(x)} \right\} \cdot f}} + {P_{1}{\left\{ {b(x)} \right\} \cdot f^{2}}} + {P_{2}\left\{ {I_{0}^{2}(x)} \right\}} +}} \\{{2\; P_{2}{\left\{ {{a(x)} \cdot {I_{0}(x)}} \right\} \cdot f}} + {P_{2}{\left\{ {{2\;{{b(x)} \cdot {I_{0}(x)}}} + {a^{2}(x)}} \right\} \cdot f^{2}}} +} \\{{2\; P_{2}{\left\{ {{a(x)} \cdot {b(x)}} \right\} \cdot f^{3}}} + {P_{2}{\left\{ {b^{2}(x)} \right\} \cdot f^{4}}} + {R_{b}(x)} +} \\{P_{Slope}\left\{ \left( {{G\left\{ I_{0} \right\}(x)} + {G\left\{ a \right\}{(x) \cdot f}} + {G\left\{ b \right\}{(x) \cdot f^{2}}}} \right)^{2} \right\}} \\{= {\left\{ {{P_{1}\left\{ {I_{0}(x)} \right\}} + {P_{2}\left\{ {I_{0}^{2}(x)} \right\}} + {P_{Slope}\left\{ {G^{2}\left\{ I_{0} \right\}(x)} \right\}} + {R_{b}(x)}} \right\} +}} \\{\left\{ {{P_{1}\left\{ {a(x)} \right\}} + {2\; P_{2}\left\{ {{a(x)} \cdot {I_{0}(x)}} \right\}} +} \right.} \\{{\left. {2\; P_{Slope}\left\{ {G\left\{ a \right\}{(x) \cdot G}\left\{ I_{0} \right\}(x)} \right\}} \right\} \cdot f} +} \\{\left\{ {{P_{1}\left\{ {b(x)} \right\}} + {P_{2}\left\{ {{2\;{{b(x)} \cdot {I_{0}(x)}}} + {a^{2}(x)}} \right\}} +} \right.} \\{{\left. {P_{Slope}\left\{ {{2\; G\left\{ a \right\}{(x) \cdot G}\left\{ I_{0} \right\}(x)} + {G^{2}\left\{ a \right\}(x)}} \right\}} \right\} \cdot f^{2}} +} \\{{2{\left\{ {{P_{2}\left\{ {{a(x)} \cdot {b(x)}} \right\}} + {P_{Slope}\left\{ {G\left\{ a \right\}{(x) \cdot G}\left\{ b \right\}(x)} \right\}}} \right\} \cdot f^{3}}} +} \\{\left\{ {{P_{2}\left\{ {b^{2}(x)} \right\}} + {P_{Slope}\left\{ {G^{2}\left\{ b \right\}(x)} \right\}}} \right\} \cdot f^{4}} \\{= {{R_{0}(x)} + {{R_{1}(x)} \cdot f} + {{R_{2}(x)} \cdot f^{2}} + {{R_{3}(x)} \cdot f^{3}} + {{R_{4}(x)} \cdot f^{4}}}}\end{matrix} & \left( {{Eq}.\mspace{14mu} 39} \right)\end{matrix}$

Once again, P₁{•} represents the linear filter for the aerial imageterm, P₂{•} represents the linear filter for the aerial image squareterm, and P_(Slope) {•} represents the linear filter for the aerialimage gradient term, while R_(b) is a mask loading bias that isindependent of the image pattern. Thus the resist image is expressed asa 4^(th)-order polynomial of the defocus value. However, in a typicalapplication, R₃(x) and R₄(x) are very small and may be ignored toimprove the computational efficiency.

As noted above, the goal of lithography design verification is to ensurethat printed resist edges and line widths are within a pre-specifieddistance from the design target. Similarly, the size of the processwindow—exposure latitude and depth of focus—are defined by CDs or edgeplacements falling within the specified margin. The various methodsoutlined above provide very efficient ways to determine the change ofresist image signal level with variation of focus and exposure dose orother, generalized PW parameters. Each method resulted in an approximateexpression of through-PW resist image variations ΔR as perturbation ofthe NC (Nominal Condition) image R₀.

In order to relate these changes in R(x) to changes in edge placement,in most cases a first-order approximation will suffice, due to the smallCD or edge placement tolerances. Therefore, the lateral shift of anyresist contour (R=0) (i.e., the edge placement change) is simplyapproximated by the image gradient G at the original (i.e. NC) contourlocation and the change in resist image level ΔR due to variation offocus, dose, etc. as:

$\begin{matrix}{{\Delta\;{{EP}\left( {x_{i},ɛ,f} \right)}} = \frac{\Delta\;{R\left( {x_{i},ɛ,f} \right)}}{G\left( {x_{i},{ɛ = 0},{f = f_{0}}} \right)}} & \left( {{Eq}.\mspace{14mu} 40} \right)\end{matrix}$where both the initial contour location and the gradient are determinedfrom the resist image at NC, i.e. R₀(x,y). The 2-dimensional edge shiftcan be calculated separately in x and y direction by the partial imagederivative in each direction, or as an absolute shift using an absolutegradient value, i.e. the geometrical sum of S_(x)=R₀(x,y)−R₀(x−1,y) andS_(y)=R₀(x,y)−R₀(x,y−1), i.e., the absolute gradient value S=√{squareroot over (S_(x) ²+S_(y) ²)}.

From the foregoing explanation, the edge shift can be directly expressedas a function of the differential images defined above:

$\begin{matrix}{{\Delta\;{{EP}\left( {x_{i},ɛ,f} \right)}} = {\frac{1}{S\left( x_{i} \right)}\left\lbrack {{{ɛ \cdot P}\left\{ {I_{0}\left( x_{i} \right)} \right\}} + {{\left( {1 + ɛ} \right) \cdot f \cdot P}\left\{ {a\left( x_{i} \right)} \right\}} + {{\left( {1 + ɛ} \right) \cdot f^{2} \cdot P}\left\{ {b\left( x_{i} \right)} \right\}}} \right\rbrack}} & \left( {{Eq}.\mspace{14mu} 41} \right)\end{matrix}$while changes in CD or line widths can be determined from adding theindividual edge placement shifts on either side of a line, resultinggenerally in ΔCD=2·ΔEP. Clearly, (Eq.41) will be able to reproduce thetypical 2^(nd) order-like through-focus behavior of CD or EPE curves.More importantly, after the set of images such as [R₀, P(I₀), P{a},P{b}] has been calculated, which may be accomplished with only ˜1× morecomputation than simulating the single image at NC (assuming that fewerTCC terms are required for sufficient accuracy on the differentials),(Eq.41) may be applied to map out analytically the complete PW for everysingle edge position on a design, without the need for any furthertime-consuming image simulation. A generic flow diagram to illustratethis method is provided in FIG. 5.

Referring to FIG. 5, the first step (Step 80) entails defining theprocess specific parameters associated with the lithography process andsystem that will be utilized in the imaging process. Thereafter,derivative TCCs A and B are generated utilizing (Eq.14) (Step 82). InStep 84, calibration test data is acquired for multiple process windowconditions. In Step 85, model parameters for R_(O){ } and/or associatedfilter P{ } are determined utilizing in part the results of Step 82.Next, the target mask pattern or design is defined (Step 86). Theprocess then proceeds to generate images such as R_(O)(x), P{I_(O)},P{a} and P{b} in Step 88. Next, the simulated image is synthesized, NCcontours are extracted, and feature EPEs are determined at a given setof edge positions {x_(i)}(Step 90). The process then proceeds to Step 92to determine EPE or CD-variations through process window at edgepositions {x_(i)}. Finally, in Step 94, the results obtained in Step 92are analyzed to determine whether the resulting image is within apredefined error tolerance, thus, determining a common process window aswell as identifying any problem area (i.e., hot-spots) within thedesign.

The methods detailed above, and in particular (Eq.41) can be appliedvery flexibly for a wide range of tasks in lithography designinspection. Some of these applications are briefly outlined below.However, it is noted that the present invention is not limited to theapplications disclosed herein.

For any particular edge or CD, (Eq.41) allows straightforwarddetermination of the focus latitude (=DOF (Depth of Focus)) at nominaldose, for a given tolerance of CD, EP or line end variation.

For any particular edge or CD, (Eq.41) allows straightforwarddetermination of the exposure dose at nominal focus, for a giventolerance of CD, EP or line end variation.

For any particular edge or CD, (Eq.41) allows straightforward mapping ofthe shape, center and area of the PW in {F,E} space or a generalized PWspace, for a given tolerance of CD, EP or line end variation.

For a set of edges or CDs covering the full chip design and all relevantpattern/feature types, the common process window of the design can beefficiently calculated, and process corrections may be derived in orderto center the common PW.

Critical, limiting patterns may be identified that define the innerboundaries of the common PW, by either having off-centered PWs or smallPWs.

The common PW area may be mapped out as a function of tolerance specs onEP or CD variations. This sensitivity analysis may provide a yieldestimate depending on design sensitivity.

Design hot spots may be identified from a full-chip analysis using(Eq.41), as patterns with PW area, DOF or exposure latitude fallingbelow a certain threshold. The behavior of these critical patterns maythen be investigated in detail by full-PW simulations, i.e. using thefull simulation model for repeated image and resist contour simulationat many points across the PW.

PWM-OPC (Process Window Maximizing—Optical Proximity Correction)Algorithm Description

As discussed previously, most existing model-based OPC approaches tweakthe feature edges so that the printed patterns are as close to thedesign intent as possible at nominal condition. However, this may resultin some features (hot spots) with very small tolerance for focus orexposure dose variation and some features having extremely smalloverlapping process window. All these will contribute to a rather smalloverall process window for the full chip. It becomes apparent that weneed a different approach in order to maximize the overall processwindow.

We first consider the gray-level resist image, on which the resistcontour is defined at a fixed threshold T. The resist contour outlinesthe printed patterns. The typical OPC approach would make the nominalresist contour match the design target edge as closely as possible, orequivalently, make the resist image intensity at the design targetlocations converge to T. As discussed previously, this turns out to beunnecessary and typically have an adversary impact on the overallprocess window. Therefore, we here disclose a methodology for OPC totruly maximize process window, which we call PWM-OPC (Process WindowMaximizing-Optical Proximity Correction).

For each evaluation point, we can determine the gray-level variationT_(r) around T so that the evaluation point's corresponding edgeplacement is acceptable over a range of process window (focus and dose)variations. Typically, there is a tolerance on the CD variation at thisevaluation point, which is denoted as ΔCD. Then the acceptablegray-level variation T_(r) may be approximated to the first order as:T_(r)=ΔEdge×slope, where ΔEdge can be set as half of the CD variation atthis evaluation point (ΔCD) and slope is the resist image intensityslope at the evaluation point. So the acceptable gray level range is[T₁, T₂], where T₁=T−T_(r)/2 and T₂=T+T_(r)/2.

The edge movement in OPC is done iteratively until the gray-level valueat each evaluation point converges to the target gray level. Before eachOPC iteration (i.e., before edge movement), the PWM-OPC finds theoptimal target gray level (at nominal condition) for each evaluationpoint, so that the process window is maximized if the evaluation pointreaches that target gray level at nominal condition after the currentiteration (i.e., after edge movement). Note that, in most existing OPC,this target gray level is always fixed at T, for every iteration andevery evaluation point. The PWM-OPC, however, dynamically sets thetarget gray level, for each evaluation point in each iteration, tomaximize the process window. The key idea of PWM-OPC is how to compute atarget gray level which is different from T, in order to better ensurethe actual gray level at the target control point stays within the range[T₁, T₂] over a certain process window. Note, although the target willbe changed for the nominal condition, the acceptable gray level rangeover process window does not move and stays centered on the originalvalue of T.

From the previous sections (e.g., (Eq. 12), (Eq. 14a), or (Eq. 16)), thegray level at a given focus and dose condition of the evaluation pointbefore an OPC iteration is,R(ε,ƒ)=P ₀ +ΔR(ε,ƒ)Where P₀ is the resist image intensity at nominal condition, ε is therelative change in exposure dose, and ƒ is the defocus value relative tothe nominal focus.

We assume the probability density of ƒ and ε during manufacturing isGaussian with standard deviation of σ_(ƒ) and σ_(ε) respectively (forreal applications, we can safely assume that σ_(ƒ)>0 and σ_(ε)>0), sothe joint probability density function of (ƒ,ε) is:

$\begin{matrix}{{P\left( {ɛ,f} \right)} = {\frac{1}{2\;\pi\;\sigma_{f}\sigma_{ɛ}}{\exp\left( {{- \frac{1}{2}}\left( {\left( \frac{f}{\sigma_{f}} \right)^{2} + \left( \frac{ɛ}{\sigma_{ɛ}} \right)^{2}} \right)} \right)}}} & \left( {{Eq}.\mspace{14mu} 42} \right)\end{matrix}$

Our goal can then be mathematically formulated as follows: Find thedesired P₀∈[T₁,T₂] for each evaluation point which maximizes the minimumvalue of (ƒ/σ_(ƒ))²+(ε/σ_(ε))² on the contours of R(ƒ,ε)=T₁ andR(ƒ,ε)=T₂, or equivalently, on the contours of ΔR(ε,ƒ)=T₁−P₀ andΔR(ε,ƒ)=T₂−P₀.

This goal will guarantee, for each evaluation point individually, thenormalized distance between (ƒ=0, ε=0) and its process window boundary'snearest point to (ƒ=0, ε=0), is largest. So, when millions of evaluationpoints on a whole chip are combined, the common process window will belargest, i.e., has the highest total probability that all evaluationpoints are within spec.

We have now formulated the mathematical problem. We next disclose how tosolve it.

Assuming that ΔR(ε,ƒ) has the following form (note that this methodologyalso works for different formulation for ΔR(ε,ƒ) (e.g., with a higherorder polynomial expansion) or generalized process window maximization(e.g., process window with other parameters such as NA)):ΔR(ε,ƒ)=ε·R ₀+(1+ε)·ƒ·P _(a)+(1+ε)·ƒ² ·P _(b)  (Eq.43)Where R₀, P_(a), and P_(b) are some image coefficients that areindependent of ε and ƒ. To simplify the expression, we approximate theexposure (dose) change by threshold change, i.e., the threshold nowbecomes T+V_(ε), and ΔR(ε,ƒ)=ƒ·P_(a)+ƒ²·P_(b) then

$\begin{matrix}{{ɛ = {\frac{1}{V}\left( {{f \cdot P_{a}} + {f^{2} \cdot P_{b}} - K + P_{0}} \right)\mspace{14mu}{for}\mspace{14mu} a\mspace{14mu}{fixed}\mspace{14mu}{gray}\mspace{14mu}{level}\mspace{14mu} K}},} & \left( {{Eq}.\mspace{14mu} 44} \right)\end{matrix}$where K=T₁ or T₂ and V is the scaling factor in the relationship betweenthe threshold change and exposure dose change and is fixed for eachevaluation point. We assume without loss of generality that V>0. Notethat (Eq.44) can also be viewed as a direct approximation for (Eq.43)for small ε. The reason follows: for threshold K, the relationshipbetween ƒ and ε at the contour point isP ₀ +ε·R ₀+(1+ε)·ƒ·P _(a)+(1+ε)·ƒ² ·P _(b) =K.Then(P ₀ +ε·R ₀)/(1+ε)+ƒ·P _(a)+ƒ² ·P _(b) =K/(1+ε).As ε is small, Taylor expansion leads to 1/(1+ε)≈1−ε by ignoring thehigher order term, the relationship can be simplified asP ₀ +ƒ·P _(a)+ƒ² ·P _(b) =K−(K−R ₀ +P ₀)ε,which results in (Eq.44).Best Focus PWM-OPC

We first assume that the nominal condition is at (or very close to) theiso-focus, i.e., P_(a)≈0. This assumption is reasonable for most modernlithography processes where the nominal condition is adjusted to bearound the iso-focal point. Then let F=ƒ² and the object functionbecomes

$\begin{matrix}{{\left( \frac{f}{\sigma_{f}} \right)^{2} + \left( \frac{ɛ}{\sigma_{ɛ}} \right)^{2}} = {{\frac{F}{\sigma_{f}^{2}} + \frac{ɛ^{2}}{\sigma_{ɛ}^{2}}} = {\frac{F}{\sigma_{f}^{2}} + \frac{\left( {{FP}_{b} - K + P_{0}} \right)^{2}}{V^{2}\sigma_{ɛ}^{2}}}}} & \left( {{Eq}.\mspace{14mu} 45} \right)\end{matrix}$We will minimize this object function in order to find the optimal P₀.Note that we have two additional constraints in this minimization:T₁≤P₀≤T₂ and F≥0.

The first constraint implies that all evaluation points' CDs should bewithin specifications at nominal condition. The second constraint isfrom the fact that F=ƒ². It becomes apparent that when the nominal focusis at the iso-focal point, both (Eq.42) and (Eq.44) are symmetric withrespect to ƒ=0. Thus, the process window is smaller if the nominal focusis shifted away from the iso-focal point. Therefore, to maximize processwindow, the nominal focus should be at iso-focus.

To solve this optimization problem, we consider several cases.

1) First, for the special case where P_(b)=0, from (Eq.45), F=0 resultsin the minimum object function value, which is

$\frac{\left( {K - P_{0}} \right)^{2}}{V^{2}\sigma_{ɛ}^{2}}.$As K=T₁ or T₂ and T₁≤P₀≤T₂, P₀=(T₁+T₂)/2 maximizes the minimum objectfunction value, thus is the optimal target gray level.2) For non-zero P_(b), we assume without loss of generality that P_(b)>0(The expressions would be similar otherwise). Let its derivative withrespect to F be 0, we have

${{\frac{1}{\sigma_{f}^{2}} + \frac{2\left( {{FP}_{b} - K + P_{0}} \right)P_{b}}{V^{2}\sigma_{ɛ}^{2}}} = 0},$which results in

$F_{0} = {\frac{1}{P_{b}}\left( {K - P_{0} - \frac{V^{2}\sigma_{ɛ}^{2}}{2\; P_{b}\sigma_{f}^{2}}} \right)}$However, because of the constraints that T₁≤P₀≤T₂ and F≥0, this F₀ isnot always achievable.

Therefore, we will discuss in two sub-cases.

-   -   1. If

${{T_{2} - \frac{V^{2}\sigma_{ɛ}^{2}}{2\; P_{b}\sigma_{f}^{2}}} < T_{1}},$since P₀∈[T₁,T₂], then F₀<0 for both K=T₁ and T₂. In this case, F=0again results in the minimum object function value, and P₀=(T₁+T₂)/2 isthe optimal target gray level. (If we take 1/0=∞, then we can considerthat P_(b)=0 as a special case of

$\left. {{T_{2} - \frac{V^{2}\sigma_{ɛ}^{2}}{2\; P_{b}\sigma_{f}^{2}}} < {T_{1}.}} \right)$

-   -   2. Otherwise, then

${T_{2} - \frac{V^{2}\sigma_{ɛ}^{2}}{2P_{b}\sigma_{f}^{2}}} \geq {T_{1}.}$

-   -   In this case, F₁≤0 for K=T₁ as P₀≥T₁. Therefore, the minimum        object function value for K=T₁ is always achieved when F₀=0,        which is

$\frac{\left( {T_{1} - P_{0}} \right)^{2}}{V^{2}\sigma_{ɛ}^{2}}.$Note that this minimum increases monotonically from 0 to

$\frac{1}{V^{2}\sigma_{ɛ}^{2}}\left( {T_{2} - T_{1}} \right)^{2}$when P₀ increases from T₁ to T₂.

-   -   When K=T₂, things are little more complicated. When

${P_{0} > {T_{2} - \frac{V^{2}\sigma_{ɛ}^{2}}{2P_{b}\sigma_{f}^{2}}}},$then F₀<0; and F=0 again results in the minimum object function value

$\frac{\left( {T_{2} - P_{0}} \right)^{2}}{V^{2}\sigma_{ɛ}^{2}},$which decreases monotonically from

$\frac{V\;\sigma_{ɛ}^{2}}{4P_{b}^{2}\sigma_{f}^{4}}$to 0 when P₀ increases from

$T_{2} - \frac{V^{2}\sigma_{ɛ}^{2}}{2P_{b}\sigma_{f}^{2}}$to T₂; When

${P_{0} \leq {T_{2} - \frac{V^{2}\sigma_{ɛ}^{2}}{2P_{b}\sigma_{f}^{2}}}},$then F₀≥0 which leads to the genuine minimum object function value

$\left. {\frac{F}{\sigma_{f}^{2}} + \frac{\left( {{FP}_{b} - T_{2} + P_{0}} \right)^{2}}{V^{2}\sigma_{ɛ}^{2}}} \right|_{F = F_{0}} = {{{\frac{1}{P_{b}\sigma_{f}^{2}}\left( {T_{2} - P_{0} - \frac{V^{2}\sigma_{ɛ}^{2}}{2P_{b}\sigma_{f}^{2}}} \right)} + \frac{V^{2}\sigma_{ɛ}^{2}}{4P_{b}^{2}\sigma_{f}^{4}}} = {\frac{1}{P_{b}\sigma_{f}^{2}}\left( {T_{2} - P_{0} - \frac{V^{2}\sigma_{ɛ}^{2}}{4P_{b}\sigma_{f}^{2}}} \right)}}$which decreases monotonically from

$\frac{1}{P_{b}\sigma_{f}^{2}}\left( {T_{2} - T_{1} - \frac{V^{2}\sigma_{ɛ}^{2}}{4P_{b}\sigma_{f}^{2}}} \right)\mspace{14mu}{to}\mspace{14mu}\frac{V\;\sigma_{ɛ}^{2}}{4P_{b}\sigma_{f}^{4}}$when P₀ increases from T₁ to

$T_{2} - {\frac{V^{2}\sigma_{ɛ}^{2}}{2P_{b}\sigma_{f}^{2}}.}$

Therefore, the minimum object function value increases for P₀ when K=T₁while the minimum object function value decreases for P₀ when K=T₂. Thusin order to maximize the process window, we should again choose P₀ suchthat these two minimum object function values are identical.

Apparently, if

${{T_{2} - \frac{V^{2}\sigma_{ɛ}^{2}}{2P_{b}\sigma_{f}^{2}}} \leq {\left( {T_{1} + T_{2}} \right)/2}},$i.e.,

${T_{2} - T_{1}} \leq \frac{V^{2}\sigma_{ɛ}^{2}}{P_{b}\sigma_{f}^{2}}$then P₀=(T₁+T₂)/2 is once more the optimal target gray level, whichresults in the maximized minimum object function value being

$\frac{\left( {T_{2} - T_{1}} \right)^{2}}{4V^{2}\sigma_{ɛ}^{2}}.$This is exemplified in FIG. 6.

More specifically, define a characteristic number for the lithographyprocess:

${\tau = \frac{\left( {T_{2} - T_{1}} \right)P_{b}\sigma_{f}^{2}}{V^{2}\sigma_{ɛ}^{2}}};$and substituting this in for the condition,

${T_{2} - T_{1}} \leq {\frac{V^{2}\sigma_{ɛ}^{2}}{P_{b}\sigma_{f}^{2}}.}$Then we can see that, P₀=(T₁+T₂)/2 is the optimal target gray level whenτ≤1. FIG. 6 demonstrates this result for the case where τ=0.75.Otherwise,

${{T_{2} - T_{1}} > \frac{V^{2}\sigma_{ɛ}^{2}}{P_{b}\sigma_{f}^{2}}},$(i.e. τ>1), maximum process window is achieved when

$\frac{\left( {T_{1} - P_{0}} \right)^{2}}{V^{2}\sigma_{ɛ}^{2}} = {\frac{1}{P_{b}\sigma_{f}^{2}}{\left( {T_{2} - P_{0} - \frac{V^{2}\sigma_{ɛ}^{2}}{4\; P_{b}\sigma_{f}^{2}}} \right).}}$The roots of this quadratic equation are

$P_{0} = {T_{1} - {\frac{V^{2}\sigma_{ɛ}^{2}}{2\; P_{b}\sigma_{f}^{2}} \pm {\sqrt{\frac{V^{2}\sigma_{ɛ}^{2}}{P_{b}\sigma_{f}^{2}}\left( {T_{2} - T_{1}} \right)}.}}}$As we know that P₀∈[T₁,T₂] and

$\frac{V^{2}\sigma_{ɛ}^{2}}{P_{b}\sigma_{f}^{2}} > 0$and T₂−T₁>0,

$P_{0} = {T_{1} - \frac{V^{2}\sigma_{ɛ}^{2}}{2\; P_{b}\sigma_{f}^{2}} + \sqrt{\frac{V^{2}\sigma_{ɛ}^{2}}{P_{b}\sigma_{f}^{2}}\left( {T_{2} - T_{1}} \right)}}$is the only valid solution.

-   -   Now verify that this root is indeed valid.    -   Since

${{T_{2} - T_{1}} > \frac{V^{2}\sigma_{ɛ}^{2}}{P_{b}\sigma_{f}^{2}}},$then

$\begin{matrix}{P_{0} = {{{T_{1} - \frac{V^{2}\sigma_{ɛ}^{2}}{2\; P_{b}\sigma_{f}^{2}} + \sqrt{\frac{V^{2}\sigma_{ɛ}^{2}}{P_{b}\sigma_{f}^{2}}\left( {T_{2} - T_{1}} \right)}} \leq {T_{1} - \frac{V^{2}\sigma_{ɛ}^{2}}{2\; P_{b}\sigma_{f}^{2}} + T_{2} - T_{1}}} = {{T_{2} - \frac{V^{2}\sigma_{ɛ}^{2}}{2\; P_{b}\sigma_{f}^{2}}} \leq T_{2}}}} & \; \\{P_{0} = {{{T_{1} - \frac{V^{2}\sigma_{ɛ}^{2}}{2\; P_{b}\sigma_{f}^{2}} + \sqrt{\frac{V^{2}\sigma_{ɛ}^{2}}{P_{b}\sigma_{f}^{2}}\left( {T_{2} - T_{1}} \right)}} \geq {T_{1} - \frac{V^{2}\sigma_{ɛ}^{2}}{2\; P_{b}\sigma_{f}^{2}} + \sqrt{\frac{V^{2}\sigma_{ɛ}^{2}}{P_{b}\sigma_{f}^{2}} \times \frac{V^{2}\sigma_{ɛ}^{2}}{P_{b}\sigma_{f}^{2}}}}} = {{T_{1} + \frac{V^{2}\sigma_{ɛ}^{2}}{2\; P_{b}\sigma_{f}^{2}}} \geq T_{1}}}} & \;\end{matrix}$

-   -   Therefore, P₀∈[T₁,T₂].    -   In summary, when τ>1,

$P_{0} = {T_{1} - \frac{V^{2}\sigma_{ɛ}^{2}}{2\; P_{b}\sigma_{f}^{2}} + \sqrt{\frac{V^{2}\sigma_{ɛ}^{2}}{P_{b}\sigma_{f}^{2}}\left( {T_{2} - T_{1}} \right)}}$is the optimal target gray level.

-   -   This case is demonstrated in FIG. 7, more specifically where τ=2

To summarize, the optimal target gray level depending on thecharacteristic number for the lithography process:

$\tau = {\frac{\left( {T_{2} - T_{1}} \right)P_{b}\sigma_{f}^{2}}{V^{2}\sigma_{ɛ}^{2}}.}$

When τ>1, then

$P_{0} = {T_{1} + {\left( {T_{2} - T_{1}} \right)\frac{{2\sqrt{\tau}} - 1}{2\;\tau}}}$is the optimal target gray level. Otherwise,

$P_{0} = \frac{T_{2} + T_{1}}{2}$is the optimal target gray level. (Eq. 46)Physical Meaning (Graphical Interpretation)

The previous section focuses entirely on the analytic derivation. Now welook at the physical meaning of the solution from the Bossung plot.

In the Focus-Exposure plot, for each evaluation point, the curvecomposed of Focus and Exposure pairs leading to a fixed gray level isparabolic. More specifically, for fixed gray level K, let us analyze thecontour R(ƒ,ε)=K. From (Eq.44), the relationship between ε and ƒ on thiscontour is

$ɛ = {\frac{1}{V}\left( {{f^{2} \cdot P_{b}} - K + P_{0}} \right)}$which is parabolic. This is exemplified in FIG. 8. The two curvescorresponding to R(ƒ,ε)=T₁ and R(ƒ,ε)=T₂ are apparently parabolic.

We again assume that P_(a)=0 and P≥0. As we have stated, we want tomaximize the minimum value of (ƒ/σ_(ƒ))²+(ε/σ_(ε))² on the contours ofR(ƒ,ε)=T₁ and R(ƒ/ε)=T₂. Since (ƒ/σ_(ƒ))²+(ε/σ_(ε))² represents anellipse, this optimization can be graphically interpreted as thefollowing steps:

-   -   1. for each P₀, we look for the ellipse that is tangent to the        two curves R(ƒ,ε)=T₁ and R(ƒ,ε)=T₂.    -   2. Find the optimal P₀ that maximizes the ellipse

Since P_(b)≥0, both curves tilt up when |ƒ| increases. For the casewhere

$P_{0} = \frac{T_{2} + T_{1}}{2}$is the optimal gray level, it is apparent that the ellipse should betangent to R(ƒ,ε)=T₂ at the lowest point, which corresponds ƒ=0 and

$ɛ = {{{\frac{1}{V}\left( {{f^{2} \cdot P_{b}} - K + P_{0}} \right)}❘_{{f = 0},{K = T_{2}},{P_{0} = {{({T_{1} + T_{2}})}\text{/}2}}}} = {\frac{T_{1} - T_{2}}{2\; V} < 0.}}$And the equation of that ellipse is

${{\frac{f^{2}}{\sigma_{f}^{2}} + \frac{ɛ^{2}}{\sigma_{ɛ}^{2}}} = \Omega},{{{where}\mspace{14mu}\Omega} = \frac{\left( {T_{1} - T_{2}} \right)^{2}}{4\; V^{2}\sigma_{ɛ}^{2}}}$

We notice that both the contour and the ellipse reach their lowest pointhere. In addition, the second-order derivative of

$ɛ = {\frac{1}{V}\left( {{f^{2} \cdot P_{b}} - K + P_{0}} \right)}$with respect to ƒ is ε₁″(ƒ)=2P_(b)/V and the second-order derivative ofthe ellipse with respect to ƒ for negative ε is

$\begin{matrix}{{ɛ_{2}^{''}(f)} = {{- \left( {\sigma_{ɛ}\sqrt{\Omega - \frac{f^{2}}{\sigma_{f}^{2}}}} \right)^{''}} = {\left( \frac{\sigma_{ɛ}f}{\sigma_{f}\sqrt{{\Omega\sigma}_{f}^{2} - f^{2}}} \right)^{\prime} = \frac{{\Omega\sigma}_{f}\sigma_{g}}{\left( {{\Omega\sigma}_{f}^{2} - f^{2}} \right)^{3/2}}}}} & \left( {{Eq}.\mspace{14mu} 47} \right)\end{matrix}$

Since ε₂″(ƒ) is an increasing function for 0≤ƒ≤√{square root over(Ωσ_(ƒ) ²)} and ε₁″(ƒ)=2P_(b)/V is a constant, thus if and only ifε₂″(ƒ)|_(ƒ=0)≥ε₁″(ƒ), the ellipse tilts up faster than the paraboliccontour and they are tangent at the lowest point. ε₂″(ƒ)|_(ƒ=0)≥ε₁″(ƒ)leads to the following conclusions:

-   -   When

${\tau = {\frac{\left( {T_{2} - T_{1}} \right)P_{b}\sigma_{f}^{2}}{V^{2}\sigma_{ɛ}^{2}} \leq 1}},{P_{0} = \frac{T_{2} + T_{1}}{2}}$is the optimal target gray level, which is consistent with the previousanalytical analysis, as shown in FIG. 8. As can be seen in FIG. 8, thecontours for R(ƒ,ε)=T₁ and R(ƒ,ε)=T₂ are tangent to the ellipticalboundary of the process window at ƒ=0; therefore,

$P_{0} = \frac{T_{2} + T_{1}}{2}$is the optimal target gray level.

-   -   When τ>1, in order to maximize the process window, the optimal        P₀ has to shift away from

$\frac{T_{2} + T_{1}}{2}$and get closer to T₁. This is graphically depicted in FIG. 9, where theellipse (in dashed line) passing point 902 is no longer tangent to thecontour of R(ƒ,ε)=T₂; while the boundary (in solid line) of the optimalprocess window is tangent to the contour of R(ƒ,ε)=T₂ at point 904.Non-Best Focus PWM-OPC

When the nominal condition is off-focus, the formula is morecomplicated. In this case, the object function becomes

$\begin{matrix}\begin{matrix}{{Q\left( {f,ɛ,P_{0},K} \right)} = {\left( \frac{f}{\sigma_{f}} \right)^{2} + \left( \frac{ɛ}{\sigma_{ɛ}} \right)^{2}}} \\{= {\frac{f^{2}}{\sigma_{f}^{2}} + \frac{\left( {{f \cdot P_{a}} + {f^{2} \cdot P_{b}} - K + P_{0}} \right)^{2}}{V^{2}\sigma_{ɛ}^{2}}}} \\{= {{\alpha\; f^{4}} + {\beta\; f^{3}} + {\left( {\chi + {\phi\left( {P_{0} - K} \right)}} \right)f^{2}} +}} \\{{{\varphi\left( {P_{0} - K} \right)}f} + {\gamma\left( {P_{0} - K} \right)}^{2}}\end{matrix} & \left( {{Eq}.\mspace{14mu} 48} \right)\end{matrix}$Where

${\alpha = \frac{P_{b}^{2}}{V^{2}\sigma_{ɛ}^{2}}},{\beta = \frac{2P_{a}P_{b}}{V^{2}\sigma_{ɛ}^{2}}},{\chi = {\frac{1}{\sigma_{f}^{2}} + \frac{P_{a}^{2}}{V^{2}\sigma_{ɛ}^{2}}}},{\phi = \frac{2P_{b}}{V^{2}\sigma_{ɛ}^{2}}},{\varphi = \frac{2P_{a}}{V^{2}\sigma_{ɛ}^{2}}},{\gamma = {\frac{1}{V^{2}\sigma_{ɛ}^{2}}.}}$This is a fourth-order polynomial function of ƒ. Again we have oneadditional constraint: T₁≤P₀≤T₂. The goal is to choose P₀ to maximizethe minimum object function value when K=T₁ and K=T₂.

As this formula cannot be simplified in the same way as (Eq.45), it ispossible that a close-form solution does not exist. However, with thehelp of geometrical analysis, we can use a simple iterative algorithm tocompute the optimal target gray level.

Without loss of generality, we assume that V>0, P_(a)>0, and P_(b)>0.Then the axis of

$ɛ = {{\frac{1}{V}\left( {{f \cdot P_{a}} + {f^{2} \cdot P_{b}} - K + P_{0}} \right)\mspace{14mu}{is}\mspace{14mu} f} = {{- \frac{P_{a}}{2P_{b}}} < 0.}}$

Suppose (ƒ(P₀),ε(P₀)) minimizes the object function Q(ƒ,ε,P₀,K), i.e.,

${Q\left( {P_{0},K} \right)} = {{Q\left( {{f\left( P_{0} \right)},{ɛ\left( P_{0} \right)},P_{0},K} \right)} = {\min\limits_{f}\;{{Q\left( {f,{\left( {{fP}_{a} + {f^{2}P_{b}} - K + P_{0}} \right)\text{/}V},P_{0},K} \right)}.}}}$

If K=T₁, then

${- \frac{P_{a}}{2P_{b}}} \leq {f\left( P_{0} \right)} \leq 0$and ε(P₀)=(ƒ(P₀)P_(a)+ƒ²(P₀)P_(b)−K+P₀)/V≥0. This can be easily seenfrom the graphical interpretation in FIG. 10. More specifically, FIG. 10shows that the contour for R(ƒ,ε)=T₁ is tangent to the ellipse at(ƒ(P₀),ε(P₀)), where

${- \frac{P_{a}}{2P_{b}}} \leq {f\left( P_{0} \right)} \leq 0$and ε(P₀)≥0. Further, as P₀ decreases, the ellipse is also “squeezed” bythe parabolic contour, thus Q(P₀,T₁) decreases. Lemma 1 in Appendix Aprovides a rigid proof. Also note that when P₀=T₁, this minimum objectfunction value achieves its minimum: 0.

Similarly, if K=T₂, then ƒ(P₀)≥0 andε(P₀)=ƒ(P₀)P_(a)+ƒ²(P₀)P_(b)−K+P₀)/V≤0, which is visualized in FIG. 11.

More specifically, FIG. 11 provides a graphical interpretation for(ƒ(P₀),ε(P₀)) when P₀≤T₂=K. The contour R(ƒ,ε)=K=T₂ is tangent to thetrue minimum ellipse (in solid line 1104) at (ƒ(P₀),ε(P₀)), whereƒ(P₀)≥0 and ε(P₀)≤0. Further, when P₀ increases from T₁ to T₂, thisminimum decreases. In other words with reference to FIG. 11, as P₀increases, the ellipse is also “squeezed” by the parabolic contour, thusQ(P₀,T₂) decreases. Again, these observations can be shown using rigidanalytical proof (See Lemma 2 in Appendix B). Also note that when P₀=T₂,this minimum object function value achieves its minimum: 0.

Thus, it becomes apparent that the necessary and sufficient conditionfor P₀ to be optimal is Q(P₀,T₁)=Q(P₀,T₂). This is exemplified by thetwo curves in FIG. 7 (though the formulas may be different, the basiccurve shapes are similar). Since Q(P₀,T₁)−Q(P₀,T₂) increasesmonotonically from a negative number to a positive number as P₀increases from T₁ to T₂, there exists a unique root for P₀∈[T₁,T₂],which can be found efficiently using standard root finding algorithmssuch as bisection method or Newton's method.

The remaining problem is how to compute Q(P₀,K). From (Eq.48),Q(ƒ,ε,P₀,K) is indeed a 4^(th)-order polynomial of defocus ƒ, i.e.,Q(ƒ,ε,P₀,K)=αƒ⁴+βƒ³+(χ+ϕ(P₀−K))ƒ²+φ(P₀−K)ƒ+γ(P₀−K)². In order todetermine the minimum, we take its derivative:Q′(ƒ,ε,P ₀ ,K)=4αƒ³+3βƒ²+2(χ+ϕ(P ₀ −K))ƒ+φ(P ₀ −K)  (Eq. 49)

If the derivative is set to 0, the root(s) will be the potential defocusvalue that minimizes Q(ƒ,ε,P₀,K). It is well known there exists ananalytical solution to the cubic equation. However, there are three(complex) roots for the cubic equation, which one achieves the globalminimum? We can find all three roots and evaluate Q(ƒ,ε,P₀,K) for allthe real roots and use the minimum one as Q(P₀,K). To further speed upthe process, we note that all coefficients of this equation are real,thus it either has three real roots or 1 real root plus 2 complementarycomplex roots. In the latter case, it is easy to choose the real root asƒ(P₀). And for K=T₁, with a little help from the graphical demonstration(e.g., FIG. 10), we can see that it falls into the latter case. For K=T₂if there are three real roots (as demonstrated in FIG. 11), then fromWeda's theorem, these three roots' sum is negative while product ispositive, so they must be two negative roots and one positive root. Aswe have shown, ƒ(P₀)≥0 for K=T₂, we can simply choose the positive root.As can be seen in FIG. 11, there are three real roots to the derivativeof (Eq.48). However, only the positive root (corresponding to theellipse in solid line 1104) achieves the global minimum. The other tworoots are negative and correspond to the ellipses with dashed boundary1102; though the contour R(ƒ,ε)=K=T₂ is also tangent to these twoellipses, they results in larger Q(ƒ,εP₀,T₂).

FIG. 12 is a flow diagram of Bisection Method to find optimal P₀ foreach evaluation point. Note that the equality check in this diagram isbased on the epsilon-absolute-error, i.e., we consider two numbers areequal if their absolute difference is less than a pre-defined smallnumber (epsilon).

More specifically, as illustrated in FIG. 12, the first step (Step 1201)in the process of finding the optimal P₀ by using bisection method ofthe given embodiment is to determine the specifics including P_(a),P_(b), σ_(ƒ), σ_(ε), V, T₁, and T₂, thus determine α, β, χ, Φ, φ, γ by(Eq.48). In this process, the equality checks (i.e, steps 1205, 1208,1212 that are described below) are based on the epsilon-absolute error,i.e., where two numbers are equal if their absolute difference is lessthan a pre-defined small number (epsilon). The next step in the process(Step 1202) is to let S₁=T₁ and S₂=T₂. In step 1203, Q(S₁,T₁) isevaluated by computing the roots of the derivative in (Eq. 49). One rootis chosen that truly minimizes Q(ƒ, ε, S₁, T₁) and then Q(ƒ, ε, S₁, T₁)is computed by (Eq.48), which is Q(S₁, T₁). In step 1204, Q(S₁, T₂) isevaluated, which is similar to step 1203. In step 1205, the differencebetween Q(S₁, T₁) and Q(S₁, T₂) is determined. If the difference isequal to zero, then the process proceeds to step 1206 and if thedifference does not equal zero, the process proceeds to step 1207. Instep 1207, the difference between Q(S₂, T₁) and Q(S₂, T₂) is determinedand the difference is compared to zero in step 1208. If the differenceis equal to zero, then the process proceeds to step 1209 and if thedifference does not equal zero, the process proceeds to step 1210. Instep 1210, S is set to be equal to (S₁+S₂)/2. In step 1211, thedifference between Q(S, T₁) and Q(S, T₂) is determined and thedifference is compared to zero in step 1212. If the difference is equalto zero, then the process proceeds to step 1213 and if the differencedoes not equal zero, the process proceeds to step 1214. In step 1214, ifthe maximum iteration number is reached or S-S₁ is small enough, thenthe process proceeds to step 1213, and if not, the process proceeds tostep 1215. In step 1215, Q(S, T₁)−Q(S, T₂) and Q(S₂, T₁)−Q(S₂, T₂) arecompared to determine if they have the same sign. If they do have thesame sign, the process proceeds to step 1216 and if they do not have thesame sign, the process proceeds to step 1217. In step 1216, S₂ isassigned with S, and the process flows to step 1210. In step 1217, S₁ isassigned with S, and the process flows to step 1210. In step 1206,P₀=S₁, and the process ends at step 1218 where P₀ is outputted. In step1209, P₀=S₂, and the process ends at step 1218 where P₀ is outputted. Instep 1213, P₀=S, and the process ends at step 1218 where P₀ isoutputted.

PWM-OPC Flow

The above algorithm can be used on top of most modernnominal-condition-OPC flows. The only extra module is the part thatspells out the individual target gray level P₀ for each evaluationpoint, then all existing flow of nominal-condition-OPC stays the same.In practice, since the layouts may be under dramatic modification duringthe OPC flow, the final coefficients in ΔR(ε,ƒ) may be far differentfrom those in ΔR(ε,ƒ) in the first few OPC iterations. For this reason,the flow can be implemented in the following way:

The nominal-condition-OPC is applied at the beginning, and the PWM-OPC(dynamic target gray level) is applied in the last few iterations.

This flow also inherently enables different CD control specification fordifferent layer and different features, by setting the gray levelcontrol range T_(r) differently for each evaluation point. For example,for evaluation points on critical features such as gates, we can setT_(r) very small or even to 0.

More specifically, as illustrated in FIG. 13, the first step (Step 1301)in the process of PWM-OPC flow, a design layout is provided without anyresolution enhancing techniques (RET) applied to the layout. At step1302, the layout is modified with RET and at step 1303, the post-RETlayout is provided. At step 1304, the mask image is provided. At step1305, an optical model including optical parameters is applied to themask image to compute a simulated aerial image (AI) at step 1306. Theaerial image is a simulation within the resist layer, which arises fromthe projection of light onto the substrate, refraction at the resistinterface and multiple reflections in the resist film stack. The lightintensity distribution (aerial image) is turned into a latent ‘resistimage’ by absorption of photons, which is further modified by diffusionprocesses and various loading effects. The optical model describes theproperties of the illumination and projection optics including, but notlimited to, NA-sigma (σ) settings, as well as, any particularillumination source shape. The optical properties of the photo-resistlayer coated on a substrate, such as refractive index, film thickness,propagation and polarization effects, may also be captured as part ofthe optical model. A resist model having resist parameters from step1307 is applied to the aerial image (AI) to compute a simulated resistimage (RI) at step 1308. The resist model describes the effects ofchemical processes which occur during resist exposure, post-exposurebake and development. At step 1309, resist image intensity is evaluatedat specific evaluation points. At step 1310, specific parametersincluding P_(a), P_(b), σ_(ƒ), σ_(ε), V and T₁ are determined from theoptical model and the resist model, and, at step 1311, an optimal P₀ isdetermined, as described above in relation to FIG. 12. At step 1312, acomparison is made with the determined optimal P₀ at step 1311 and theRI intensity at evaluation points identified at step 1309. At step 1313,if the resist image is good enough as determined by the comparison step1312, then the process ends and the mask image is outputted at step1314. If the resist image is determined to be not good enough, theprocess proceeds back to step 1302 and the process repeats until theresist image is determined to be good enough.

For comparison, a conventional Nominal Condition OPC flow does notinclude steps 1310 and 1311 as described above in FIG. 13, and thedynamically determined P₀ is typically replaced by a constant threshold.

Another practical consideration is the interpolation of image atevaluation points. In lithography simulation, all the images aregenerated on a grid. Typically, the evaluation point does not fallexactly on the grid.

In one embodiment, we compute the P_(a), P_(b), T₁, and T₂ directly onall evaluation points. Note that the computation of T₁ and T₂ depends onthe slope, thus depends on the location. In this embodiment, aerialimages can be computed at arbitrary locations using (Eq.1b), and thederivative aerial images can be computed in the same way from thederivatives of TCC using (Eq.1d). However, due the irregular positionsof the evaluation points, this method may be computationally expensive.

Alternatively, we apply interpolation to compute the image intensity atthe evaluation point. Suppose the computed resist image is R(iΔ), i=0,±1, ±2 . . . , where Δ is grid resolution. Then for an arbitrary x whichis not a multiple of Δ, we can apply a linear filter to R(iΔ):R(x)=R(iΔ)

h(x). In its simplest form, h(x) can be as simple as a linearcombination of two neighboring pixels, i.e., h(x)=1−|x|, for |x|<1 and 0elsewhere. We can also apply Nyquist reconstruction, where h(x) iscomposed of sinc functions, i.e., h(x)=sin(πx/Δ)/(πx). The choice ofh(x) depends on the applications to strike a balance betweencomputational cost and accuracy. Note that because of the separabilityof the filter, for two-dimensional image, we apply the filter on eachdimension independently.

Further, as the filter operation is linear, we notice thatR(ε,ƒ)=P ₀ +ε·R ₀+(1+ε)·ƒ·P _(a)+(1+ε)·ƒ² ·P _(b)is a linear function. Thus, suppose R(ε,ƒ) is computed on a grid, thenR(x)=R(ε,ƒ)

h(x)=[P ₀

h(x)]+ε·[R ₀

h(x)]+(1+ε)·ƒ·[P _(a)

h(x)]+(1+ε)·ƒ²·[P _(b)

h(x)]which implies that we can compute P_(a), P_(b), and etc on the gridfirst, then apply the interpolation filter on P_(a), P_(b), and etc tocompute the coefficients on evaluation points.Nominal Condition Optimization

Further, we can determine the best nominal condition based on denseprocess window sampling. After the OPC (either nominal condition OPC orPWM-OPC), we determine the common process window, which can be specifiedas:ƒ_(min)≤ƒ≤ƒ_(max) and ε_(min)(ƒ)≤ε≤ε_(max)(ƒ).

Then we may be able to shift the nominal condition by (ƒ₀,ε₀) tomaximize the probability of this process window. Again, the jointprobability density function of (ƒ,ε) is:

${P\left( {f,ɛ} \right)} = {\frac{1}{2{\pi\sigma}_{f}\sigma_{e}}{\exp\left( {{- \frac{1}{2}}\left( {\left( \frac{f}{\sigma_{f}} \right)^{2} + \left( \frac{ɛ}{\sigma_{ɛ}} \right)^{2}} \right)} \right)}}$from (Eq.42). However, the process window becomesƒ_(min)−ƒ₀≤ƒ≤ƒ_(max)−ƒ₀ and ε_(min)(ƒ)−ε₀≤ε≤ε_(max)(ƒ)−ε₀.

The goal is to determine the optimal (ƒ₀,ε₀) to maximize

$\begin{matrix}{P_{PW} = {\overset{f_{\max} - f_{0}}{\int\limits_{f_{\min} - f_{0}}}{\overset{{ɛ_{\max}{(f)}} - ɛ_{0}}{\int\limits_{{ɛ_{\min}{(f)}} - ɛ_{0}}}{\frac{1}{2{\pi\sigma}_{f}\sigma_{ɛ}}{\exp\left( {{- \frac{1}{2}}\left( {\left( \frac{f}{\sigma_{f}} \right)^{2} + \left( \frac{ɛ}{\sigma_{ɛ}} \right)^{2}} \right)} \right)}{dfd}\; ɛ}}}} \\{= {\overset{f_{\max}}{\int\limits_{f_{\min}}}{\overset{ɛ_{\max}{(f)}}{\int\limits_{ɛ_{\min}{(f)}}}{\frac{1}{2{\pi\sigma}_{f}\sigma_{ɛ}}{\exp\left( {{- \frac{1}{2}}\left( {\left( \frac{f + f_{0}}{\sigma_{f}} \right)^{2} + \left( \frac{ɛ + ɛ_{0}}{\sigma_{ɛ}} \right)^{2}} \right)} \right)}{dfd}\; ɛ}}}}\end{matrix}$

We can solve this optimization problem in the following way:

We first take its derivative with respect to ε₀, i.e.,

$\begin{matrix}{\frac{\partial P_{PW}}{\partial ɛ_{0}} = {- {\overset{f_{\max}}{\int\limits_{f_{\min}}}{\overset{ɛ_{\max}{(f)}}{\int\limits_{ɛ_{\min}{(f)}}}{\frac{1}{2{\pi\sigma}_{f}\sigma_{ɛ}}{\exp\left( {{- \frac{1}{2}}\left( {\left( \frac{f + f_{0}}{\sigma_{f}} \right)^{2} + \left( \frac{ɛ + ɛ_{0}}{\sigma_{ɛ}} \right)^{2}} \right)} \right)}\left( {ɛ +} \right.}}}}} \\{\left. ɛ_{0} \right){dfd}\; ɛ} \\{= {{- \frac{1}{2{\pi\sigma}_{f}\sigma_{ɛ}^{3}}}{\overset{f_{\max}}{\int\limits_{f_{\min}}}{{\exp\left( {{- \frac{1}{2}}\left( \left( \frac{f + f_{0}}{\sigma_{f}} \right)^{2} \right)} \right)}{df}\overset{ɛ_{\max}{(f)}}{\int\limits_{ɛ_{\min}{(f)}}}}}}} \\{{\exp\left( {{- \frac{1}{2}}\left( \left( \frac{ɛ + ɛ_{0}}{\sigma_{ɛ}} \right)^{2} \right)} \right)}\left( {ɛ + ɛ_{0}} \right)d\; ɛ} \\{= {\frac{1}{2{\pi\sigma}_{f}\sigma_{ɛ}}{\overset{f_{\max}}{\int\limits_{f_{\min}}}{{\exp\left( {{- \frac{1}{2}}\left( \left( \frac{f + f_{0}}{\sigma_{f}} \right)^{2} \right)} \right)}{df}\;\exp}}}} \\{\left( {{- \frac{1}{2}}\left( \left( \frac{ɛ + ɛ_{0}}{\sigma_{ɛ}} \right)^{2} \right)} \right)❘_{ɛ = {ɛ_{\min}(f)}}^{ɛ = {ɛ_{\max}{(f)}}}} \\{= {\frac{1}{2{\pi\sigma}_{f}\sigma_{ɛ}}{\overset{f_{\max}}{\int\limits_{f_{\min}}}{{\exp\left( {{- \frac{1}{2}}\left( \left( \frac{f + f_{0}}{\sigma_{f}} \right)^{2} \right)} \right)}\left\lbrack {{\exp\left( {{- \frac{1}{2}}\left( \left( \frac{{ɛ_{\max}(f)} + ɛ_{0}}{\sigma_{ɛ}} \right)^{2} \right)} \right)} -} \right.}}}} \\{\left. {\exp\left( {{- \frac{1}{2}}\left( \left( \frac{{ɛ_{\min}(f)} + ɛ_{0}}{\sigma_{ɛ}} \right)^{2} \right)} \right)} \right\rbrack{df}} \\{= 0.}\end{matrix}$we then use numerical method to find the root of ε₀ for each ƒ₀.Similarly, we find the root of

$\frac{\partial P_{PW}}{\partial f} = 0$for each ε₀. We do these two steps alternatively and iteratively untilthe roots converge.

Note that for special cases, there may exist simpler solutions. Forexample, as we have mentioned, if the nominal focus is at best focus,then ƒ₀=0 is always optimal, we only need to optimize for ε₀. The bestfocus itself may also be estimated through

$f_{0} = {- \frac{P_{a}}{2P_{b}}}$as a result of (Eq.44).

We may also be able to use mean of (ƒ,ε) inside the process window.

Further, we can combine PWM-OPC with this nominal conditionoptimization, as shown in FIG. 14. More specifically, as illustrated inFIG. 14, in the first step (Step 1401) in the process of OPC combinedwith the nominal condition optimization, a design layout is providedwithout any resolution enhancing techniques (RET) applied to the layout.At step 1402, the layout is modified with a nominal condition OPC orPWM-OPC and at step 1403, the post-RET layout is provided. At step 1404,a nominal condition optimization is performed. If the design is goodenough as determined at step 1405, the process is stopped and the designis outputted at step 1406. If the design is not good enough, the processreturns to step 1402 and repeated until an acceptable design isestablished.

In an embodiment, there is provided a method for maximizing a processwindow of a photolithographic process comprising: computing ananalytical function of process condition parameters that approximates aresist image value across a process window for each of a plurality ofevaluation points in a target pattern; determining a target value of theresist image value at nominal condition for each evaluation point basedon the analytical function, so that the process window is maximized; andusing the target value as an optimizing target for each evaluation pointin an optical proximity correction iteration.

In an embodiment, the process window comprises a range of a specificprocess parameter within which the resist critical dimension andtherefore resist image value is contained in a pre-defined range. In anembodiment, the specific process parameter includes focus and/orexposure. In an embodiment, a probability distribution of focus andexposure variation is Gaussian. In an embodiment, the target resistimage value at nominal condition is determined using numerical methodsincluding a bisection method. In an embodiment, an analytical targetresist image value is given for best focus. In an embodiment, theanalytical function comprises a polynomial function. In an embodiment,the target value for each evaluation point is recomputed at each of aplurality of optical proximity correction iterations. In an embodiment,the analytical function is approximated as a polynomial function offocus and exposure, R(ε,ƒ)=P₀+ƒ²·P_(b) having an associated threshold ofT+Vε for resist image contours, where P₀ represents resist imageintensity at best focus, ƒ represents a defocus value relative to thebest focus, ε represents an exposure change, V represents a scaling ofthe exposure change, and P_(b) represents second order derivativeimages. In an embodiment, the process window is a tolerance inexposure-defocus space over which critical dimension variations arewithin a predefined range around a nominal line width. In an embodiment,the process window is a parameter tolerance over which criticaldimension variations are within a predefined range around a nominal linewidth, wherein the parameter is selected from the group consisting ofdefocus, exposure dose, numerical aperture, sigma, aberration,polarization, and an optical constant of a resist layer.

In an embodiment, there is provided a method of maximizing lithographicprocess window for a target pattern, the method comprising: determiningan acceptable variation of resist image values around a fixed thresholdfor a plurality of evaluation points in the target pattern; computing anoptimal target value within the acceptable variation of resist imagevalues for each evaluation point for a nominal process window condition,so that the process parameter variation range is maximal subject to thecondition that the resist image value is kept within its acceptablevariation; and performing an edge movement process in an iterativemanner in an optical proximity correction process until an approximatedresist image value at each evaluation point converges to the optimaltarget value.

In an embodiment, the target level is dynamically set for eachevaluation point in each iteration to maximize the process window. In anembodiment, the target level is dynamically set for each evaluationpoint in each iteration to maximize the process window. In anembodiment, the process window comprises a range of a specificlithographic process parameter. In an embodiment, the specificlithographic process parameter includes focus and/or exposure. In anembodiment, the method further comprises computing an analyticalfunction that approximates the resist image value across the processwindow for each of the plurality of evaluation points in the targetpattern. In an embodiment, the analytical function comprises apolynomial function.

In an embodiment, there is provided a method for maximizing a processwindow associated with a lithography process for a target pattern, themethod comprising: determining an optimal target gray level of a resistimage value for each of a plurality of evaluation points in the targetpattern based on an analytical function so as to maximize the processwindow; using the target gray level value as the optimization target forthe resist image value for each evaluation point in an optical proximitycorrection iteration; and determining the best edge movement amount ofthe optical proximity correction iteration so that the resulting resistimage value equals to the target gray level.

In an embodiment, there is provided a method for maximizing a processwindow associated with a lithographic process for a target pattern, themethod comprising: determining an optimal target gray level of a resistimage value at a nominal process condition for each of a plurality ofevaluation points in the target pattern based on an analytical functionso as to maximize the process window for a given nominal condition;performing an edge movement process in an iterative manner in an opticalproximity correction process until an approximated resist image value ateach evaluation point converges to the optimal target gray level valueat the nominal process condition; determining the optimal nominalcondition for the resulting resist image from the optical proximitycorrection so as to maximize the process window; and alternativelyre-performing the determination of the optimal target gray level, theoptical proximity correction, and the determination of the optimalnominal condition until convergence on an optimal target pattern.

In an embodiment, there is provided a computer program productcomprising one or more computer-readable storage media havingcomputer-executable instructions for causing a computer to maximizelithographic process window for a target pattern, the instructionscausing the computer to perform a method comprising: computing ananalytical function of the process parameters that approximates a resistimage value across a process window for each of a plurality ofevaluation points in the target pattern; determining a target value ofthe resist image value at the nominal process condition for eachevaluation point based on the analytical function, so that the processwindow is maximized; and using the target value as an optimizing targetfor each evaluation point in an optical proximity correction iteration.

In an embodiment, there is provided a device manufactured by amanufacturing method comprising: computing an analytical function of theprocess parameters that approximates a resist image value across aprocess window for each of a plurality of evaluation points in a targetpattern; determining a target value of the resist image value at thenominal process condition for each evaluation point based on theanalytical function, so that the process window is maximized; using thetarget value as an optimizing target for each evaluation point in anoptical proximity correction iteration; and imaging the target patternafter one or more of the optical proximity correction iterations using alithographic apparatus.

In an embodiment, there is provided a method for maximizing a range ofacceptable values of at least one process condition parameter of aphotolithographic process comprising: using an analytical function ofthe at least one process condition parameter for approximating the valueof a resist image characteristic across a plurality of values of the atleast one process condition parameters for each of a plurality ofevaluation points in the target pattern; determining a target value ofthe resist image characteristic for each evaluation point based on theapproximations corresponding to a range of maximum width; and using thetarget value as an optimizing target for each evaluation point in anoptical proximity correction iteration.

In an embodiment, there is provided a computer program productcomprising one or more computer-readable storage media havingcomputer-executable instructions for causing a computer to perform amethod as described herein.

FIG. 15 is a block diagram that illustrates a computer system 100 whichcan assist in the simulation method disclosed herein. Computer system100 includes a bus 102 or other communication mechanism forcommunicating information, and a processor 104 coupled with bus 102 forprocessing information. Computer system 100 also includes a main memory106, such as a random access memory (RAM) or other dynamic storagedevice, coupled to bus 102 for storing information and instructions tobe executed by processor 104. Main memory 106 also may be used forstoring temporary variables or other intermediate information duringexecution of instructions to be executed by processor 104. Computersystem 100 further includes a read only memory (ROM) 108 or other staticstorage device coupled to bus 102 for storing static information andinstructions for processor 104. A storage device 110, such as a magneticdisk or optical disk, is provided and coupled to bus 102 for storinginformation and instructions.

Computer system 100 may be coupled via bus 102 to a display 112, such asa cathode ray tube (CRT) or flat panel or touch panel display fordisplaying information to a computer user. An input device 114,including alphanumeric and other keys, is coupled to bus 102 forcommunicating information and command selections to processor 104.Another type of user input device is cursor control 116, such as amouse, a trackball, or cursor direction keys for communicating directioninformation and command selections to processor 104 and for controllingcursor movement on display 112. This input device typically has twodegrees of freedom in two axes, a first axis (e.g., x) and a second axis(e.g., y), that allows the device to specify positions in a plane. Atouch panel (screen) display may also be used as an input device.

According to one embodiment of the invention, portions of the simulationprocess may be performed by computer system 100 in response to processor104 executing one or more sequences of one or more instructionscontained in main memory 106. Such instructions may be read into mainmemory 106 from another computer-readable medium, such as storage device110. Execution of the sequences of instructions contained in main memory106 causes processor 104 to perform the process steps described herein.One or more processors in a multi-processing arrangement may also beemployed to execute the sequences of instructions contained in mainmemory 106. In alternative embodiments, hard-wired circuitry may be usedin place of or in combination with software instructions to implementthe invention. Thus, embodiments of the invention are not limited to anyspecific combination of hardware circuitry and software.

The term “computer-readable medium” as used herein refers to any mediumthat participates in providing instructions to processor 104 forexecution. Such a medium may take many forms, including but not limitedto, non-volatile media, volatile media, and transmission media.Non-volatile media include, for example, optical or magnetic disks, suchas storage device 110. Volatile media include dynamic memory, such asmain memory 106. Transmission media include coaxial cables, copper wireand fiber optics, including the wires that comprise bus 102.Transmission media can also take the form of acoustic or light waves,such as those generated during radio frequency (RF) and infrared (IR)data communications. Common forms of computer-readable media include,for example, a floppy disk, a flexible disk, hard disk, magnetic tape,any other magnetic medium, a CD-ROM, DVD, any other optical medium,punch cards, paper tape, any other physical medium with patterns ofholes, a RAM, a PROM, and EPROM, a FLASH-EPROM, any other memory chip orcartridge, a carrier wave as described hereinafter, or any other mediumfrom which a computer can read.

Various forms of computer readable media may be involved in carrying oneor more sequences of one or more instructions to processor 104 forexecution. For example, the instructions may initially be borne on amagnetic disk of a remote computer. The remote computer can load theinstructions into its dynamic memory and send the instructions over atelephone line using a modem. A modem local to computer system 100 canreceive the data on the telephone line and use an infrared transmitterto convert the data to an infrared signal. An infrared detector coupledto bus 102 can receive the data carried in the infrared signal and placethe data on bus 102. Bus 102 carries the data to main memory 106, fromwhich processor 104 retrieves and executes the instructions. Theinstructions received by main memory 106 may optionally be stored onstorage device 110 either before or after execution by processor 104.

Computer system 100 also preferably includes a communication interface118 coupled to bus 102. Communication interface 118 provides a two-waydata communication coupling to a network link 120 that is connected to alocal network 122. For example, communication interface 118 may be anintegrated services digital network (ISDN) card or a modem to provide adata communication connection to a corresponding type of telephone line.As another example, communication interface 118 may be a local areanetwork (LAN) card to provide a data communication connection to acompatible LAN. Wireless links may also be implemented. In any suchimplementation, communication interface 118 sends and receiveselectrical, electromagnetic or optical signals that carry digital datastreams representing various types of information. Network link 120typically provides data communication through one or more networks toother data devices. For example, network link 120 may provide aconnection through local network 122 to a host computer 124 or to dataequipment operated by an Internet Service Provider (ISP) 126. ISP 126 inturn provides data communication services through the worldwide packetdata communication network, now commonly referred to as the “Internet”128. Local network 122 and Internet 128 both use electrical,electromagnetic or optical signals that carry digital data streams. Thesignals through the various networks and the signals on network link 120and through communication interface 118, which carry the digital data toand from computer system 100, are exemplary forms of carrier wavestransporting the information.

Computer system 100 can send messages and receive data, includingprogram code, through the network(s), network link 120, andcommunication interface 118. In the Internet example, a server 130 mighttransmit a requested code for an application program through Internet128, ISP 126, local network 122 and communication interface 118. Inaccordance with the invention, one such downloaded application providesfor the illumination optimization of the embodiment, for example. Thereceived code may be executed by processor 104 as it is received, and/orstored in storage device 110, or other non-volatile storage for laterexecution. In this manner, computer system 100 may obtain applicationcode in the form of a carrier wave.

FIG. 16 schematically depicts an exemplary lithographic projectionapparatus whose performance could be simulated utilizing the process ofpresent invention. The apparatus comprises:

-   -   a radiation system Ex, IL, for supplying a projection beam PB of        radiation. In this particular case, the radiation system also        comprises a radiation source LA;    -   a first object table (mask table) MT provided with a mask holder        for holding a mask MA (e.g., a reticle), and connected to first        positioning means for accurately positioning the mask with        respect to item PL;    -   a second object table (substrate table) WT provided with a        substrate holder for holding a substrate W (e.g., a        resist-coated silicon wafer), and connected to second        positioning means for accurately positioning the substrate with        respect to item PL;    -   a projection system (“lens”) PL (e.g., a refractive, catoptric        or catadioptric optical system) for imaging an irradiated        portion of the mask MA onto a target portion C (e.g., comprising        one or more dies) of the substrate W.

As depicted herein, the apparatus is of a reflective type (i.e., has areflective mask). However, in general, it may also be of a transmissivetype, for example (with a transmissive mask). Alternatively, theapparatus may employ another kind of patterning means as an alternativeto the use of a mask; examples include a programmable mirror array orLCD matrix.

The source LA (e.g., a mercury lamp or excimer laser) produces a beam ofradiation. This beam is fed into an illumination system (illuminator)IL, either directly or after having traversed conditioning means, suchas a beam expander Ex, for example. The illuminator IL may compriseadjusting means AM for setting the outer and/or inner radial extent(commonly referred to as σ-outer and σ-inner, respectively) of theintensity distribution in the beam. In addition, it will generallycomprise various other components, such as an integrator IN and acondenser CO. In this way, the beam PB impinging on the mask MA has adesired uniformity and intensity distribution in its cross-section.

It should be noted with regard to FIG. 16 that the source LA may bewithin the housing of the lithographic projection apparatus (as is oftenthe case when the source LA is a mercury lamp, for example), but that itmay also be remote from the lithographic projection apparatus, theradiation beam that it produces being led into the apparatus (e.g., withthe aid of suitable directing mirrors); this latter scenario is oftenthe case when the source LA is an excimer laser (e.g., based on KrF, ArFor F₂ lasing). The current invention encompasses at least both of thesescenarios.

The beam PB subsequently intercepts the mask MA, which is held on a masktable MT. Having traversed the mask MA, the beam PB passes through thelens PL, which focuses the beam PB onto a target portion C of thesubstrate W. With the aid of the second positioning means (andinterferometric measuring means IF), the substrate table WT can be movedaccurately, e.g. so as to position different target portions C in thepath of the beam PB. Similarly, the first positioning means can be usedto accurately position the mask MA with respect to the path of the beamPB, e.g., after mechanical retrieval of the mask MA from a mask library,or during a scan. In general, movement of the object tables MT, WT willbe realized with the aid of a long-stroke module (coarse positioning)and a short-stroke module (fine positioning), which are not explicitlydepicted in FIG. 16. However, in the case of a wafer stepper (as opposedto a step-and-scan tool) the mask table MT may just be connected to ashort stroke actuator, or may be fixed.

The depicted tool can be used in two different modes: In step mode, themask table MT is kept essentially stationary, and an entire mask imageis projected in one go (i.e., a single “flash”) onto a target portion C.The substrate table WT is then shifted in the x and/or y directions sothat a different target portion C can be irradiated by the beam PB; Inscan mode, essentially the same scenario applies, except that a giventarget portion C is not exposed in a single “flash”. Instead, the masktable MT is movable in a given direction (the so-called “scandirection”, e.g., the y direction) with a speed v, so that theprojection beam PB is caused to scan over a mask image; concurrently,the substrate table WT is simultaneously moved in the same or oppositedirection at a speed V=Mv, in which M is the magnification of the lensPL (typically, M=¼ or ⅕). In this manner, a relatively large targetportion C can be exposed, without having to compromise on resolution.

The concepts disclosed herein may simulate or mathematically model anygeneric imaging system for imaging sub wavelength features, and may beespecially useful with emerging imaging technologies capable ofproducing wavelengths of an increasingly smaller size. Emergingtechnologies already in use include EUV (extreme ultra violet)lithography that is capable of producing a 193 nm wavelength with theuse of a ArF laser, and even a 157 nm wavelength with the use of aFluorine laser. Moreover, EUV lithography is capable of producingwavelengths within a range of 20-5 nm by using a synchrotron or byhitting a material (either solid or a plasma) with high energy electronsin order to produce photons within this range. Because most materialsare absorptive within this range, illumination may be produced byreflective mirrors with a multi-stack of Molybdenum and Silicon. Themulti-stack mirror has a 40 layer pairs of Molybdenum and Silicon wherethe thickness of each layer is a quarter wavelength. Even smallerwavelengths may be produced with X-ray lithography. Typically, asynchrotron is used to produce an X-ray wavelength. Since most materialis absorptive at x-ray wavelengths, a thin piece of absorbing materialdefines where features would print (positive resist) or not print(negative resist).

While the concepts disclosed herein may be used for imaging on asubstrate such as a silicon wafer, it shall be understood that thedisclosed concepts may be used with any type of lithographic imagingsystems, e.g., those used for imaging on substrates other than siliconwafers.

The attached Appendices form part of the present disclosure and areincorporated herein by reference in their entirety.

Although the present invention has been particularly described withreference to the preferred embodiments thereof, it should be readilyapparent to those of ordinary skill in the art that changes andmodifications in the form and details may be made without departing fromthe spirit and scope of the invention. It is intended that the appendedclaims encompass such changes and modifications.

APPENDIX A

Lemma 1. If ƒ(P₀) and ε(P₀)=(ƒ(P₀)P_(a)+ƒ₂(P₀)P_(b)−K+P₀)/V minimizeQ(ƒ,ε,P₀,K) over all possible (ƒ,ε) such that ε=(ƒP_(a)+ƒ²P_(b)−K+P₀)/V,i.e.,

${Q\left( {P_{0},K} \right)} = {{Q\left( {{f\left( P_{0} \right)},{ɛ\left( P_{0} \right)},P_{0},K} \right)} = {\min\limits_{f}\;{{Q\left( {f,{\left( {{fP}_{a} + {f^{2}P_{b}} - K + P_{0}} \right)\text{/}V},P_{0},K} \right)}.}}}$for P>0, P_(b)>0, and P₀≥K, then

${- \frac{P_{a}}{2P_{b}}} \leq {f\left( P_{0} \right)} \leq 0$and ε(P₀)=ε(ƒ(P₀)P_(a)+ƒ²(P₀)P_(b)−K+P₀)/V≥0.Further, as P₀ decreases, the minimum Q(P₀,K) also decreases.Proof: 1) First we show that for any (ƒ₁,ε₁) such that

${{f_{1} < {{- \frac{P_{a}}{2P_{b}}}\mspace{14mu}{and}\mspace{14mu} ɛ_{1}}} = {\frac{1}{V}\left( {{f_{1} \cdot P_{a}} + {f_{1}^{2} \cdot P_{b}} - K + P_{0}} \right)}},$there exists a pair (ƒ₂,ε₂) such that

${f_{2} > {- \frac{P_{a}}{2P_{b}}}},{ɛ_{2} = {\frac{1}{V}\left( {{f_{2} \cdot P_{a}} + {f_{2}^{2} \cdot P_{b}} - K + P_{0}} \right)}},$and Q(ƒ₁,ε₁,P₀,K)>Q(ƒ₂,ε₂,P₀,K). Thus, ƒ(P₀) must be greater than

$- \frac{P_{a}}{2P_{b}}$(otherwise, it cannot minimize Q(ƒ,ε,P₀,K)). To show this, let

$f_{2} = {{{- \frac{P_{a}}{P_{b}}} - f_{1}} > {- {\frac{P_{a}}{2P_{b}}.}}}$Further,

$\begin{matrix}{{{f_{2}^{2} - f_{1}^{2}} = {{\frac{P_{a}^{2}}{P_{b}^{2}} + {\frac{2P_{a}}{P_{b}}f_{1}}} = {{\frac{2P_{a}}{P_{b}}\left( {f_{1} + \frac{P_{a}}{2P_{b}}} \right)} < 0}}},{while}} & \; \\\begin{matrix}{ɛ_{2} = {\frac{1}{V}\left( {{f_{2} \cdot P_{a}} + {f_{2}^{2} \cdot P_{b}} - K + P_{0}} \right)}} \\{= {\frac{P_{b}}{V}\left( {\left( {f_{2} + \frac{P_{a}}{2P_{b}}} \right)^{2} - \frac{P_{a}^{2}}{4P_{b}^{2}} + \frac{P_{0} - K}{P_{b}}} \right)}} \\{= {\frac{P_{b}}{V}\left( {\left( {{- f_{1}} - \frac{P_{a}}{2P_{b}}} \right)^{2} - \frac{P_{a}^{2}}{4P_{b}^{2}} + \frac{P_{0} - K}{P_{b}}} \right)}} \\{= {\frac{1}{V}\left( {{f_{1} \cdot P_{a}} + {f_{1}^{2} \cdot P_{b}} - K + P_{0}} \right)}} \\{= ɛ_{1}}\end{matrix} & \;\end{matrix}$Therefore,

${Q\left( {f_{1},ɛ_{1},P_{0},K} \right)} = {{{\left( \frac{f_{1}}{\sigma_{f}} \right)^{2} + \left( \frac{ɛ_{1}}{\sigma_{ɛ}} \right)^{2}} > {\left( \frac{f_{2}}{\sigma_{f}} \right)^{2} + \left( \frac{ɛ_{2}}{\sigma_{ɛ}} \right)^{2}}} = {{Q\left( {f_{2},ɛ_{2},P_{0},K} \right)}.}}$2) Second, we show that for any (ƒ₁,ε₁) such that ƒ₁>0 and

${ɛ_{1} = {\frac{1}{V}\left( {{f_{1} \cdot P_{a}} + {f_{1}^{2} \cdot P_{b}} - K + P_{0}} \right)}},$Q(ƒ₁,ε₁,P₀,K)>Q(ƒ₂,ε₂,P₀,K), where ƒ₂=0 and ε₂=(P₀−K)/V. Thus, ƒ(P₀)must be less than or equal to 0. To show this, we notice that sinceƒ₁>0, then

$\begin{matrix}{ɛ_{1} = {\frac{1}{V}\left( {{f_{1} \cdot P_{a}} + {f_{1}^{2} \cdot P_{b}} - K + P_{0}} \right)}} \\{= {{\frac{P_{b}}{V}\left( {\left( {f_{1} + \frac{P_{a}}{2P_{b}}} \right)^{2} - \frac{P_{a}^{2}}{4P_{b}^{2}} + \frac{P_{0} - K}{P_{b}}} \right)} >}} \\{\frac{P_{b}}{V}\left( {\left( {0 + \frac{P_{a}}{2P_{b}}} \right)^{2} - \frac{P_{a}^{2}}{4P_{b}^{2}} + \frac{P_{0} - K}{P_{b}}} \right)} \\{= {{\frac{1}{V}\left( {P_{0} - K} \right)} = ɛ_{2}}} \\{{\geq 0},}\end{matrix}$The last inequality follows since P₀≥K.Therefore,

${Q\left( {f_{1},ɛ_{1},P_{0},K} \right)} = {{{\left( \frac{f_{1}}{\sigma_{f}} \right)^{2} + \left( \frac{ɛ_{1}}{\sigma_{ɛ}} \right)^{2}} > {\left( \frac{f_{2}}{\sigma_{f}} \right)^{2} + \left( \frac{{\overset{.}{ɛ}}_{2}}{\sigma_{ɛ}} \right)^{2}}} = {{Q\left( {f_{2},ɛ_{2},P_{0},K} \right)}.}}$3) Now we have shown that

${- \frac{P_{a}}{2P_{b}}} \leq {f\left( P_{0} \right)} \leq 0.$The next part is to show that ε(P₀)=(ƒ(P₀)P_(a)+ƒ²(P₀)P_(b)−K+P₀)/V≥0.Obviously, if Δ=P_(a) ²−4P_(b)(P₀−K)≤0, then for any

$f,{ɛ = {{\left( {{fP}_{a} + {f^{2}P_{b}} - K + P_{0}} \right)/V} = {{\frac{P_{b}}{V}\left( {f + \frac{P_{a}}{2P_{b}}} \right)^{2}} - \frac{\Delta}{4P_{b}V}}}}$is always non-negative, which proves the lemma. Otherwise, Δ=P_(a)²−4P_(b)(P₀−K)>0. Let

${f^{*} = \frac{{- P_{a}} + \sqrt{\Delta}}{2P_{b}}},$then for any (ƒ₁,ε₁) such that

${{- \frac{P_{a}}{2P_{b}}} \leq f_{1} < f^{*}},$we have

${ɛ_{1} = {{\frac{1}{V}\left( {{f_{1} \cdot P_{a}} + {f_{1}^{2} \cdot P_{b}} - K + P_{0}} \right)} < 0}},$and it is apparent that

${{Q\left( {f_{1},ɛ_{1},P_{0},K} \right)} = {{{\left( \frac{f_{1}}{\sigma_{f}} \right)^{2} + \left( \frac{ɛ_{1}}{\sigma_{ɛ}} \right)^{2}} > {\left( \frac{f^{*}}{\sigma_{f}} \right)^{2} + \left( \frac{ɛ^{*}}{\sigma_{ɛ}} \right)^{2}}} = {Q\left( {f^{*},ɛ^{*},P_{0},K} \right)}}},\mspace{20mu}{where}$$\mspace{20mu}{ɛ^{*} = {{\frac{1}{V}\left( {{f^{*} \cdot P_{a}} + {f^{*2} \cdot P_{b}} - K + P_{0}} \right)} = 0.}}$Thus, ε(P₀) must be greater than 0.4) Finally, we will show that as P₀ increases, the minimum Q(P₀,K) alsoincreases.

For any K≤P₀′<P₀, define ƒ₁(P₀′)=kƒ(P₀) and ε₁(P₀′)=kε(P₀), where k isthe root of equation H(k)=k²ƒ² (P₀)P_(b)+k(K−P₀−ƒ² (P₀)P_(b))+P₀′−K=0which falls between 0 and 1. Before moving forward, we should show thatthere exists a root in [0, 1) for this equation, which can be proved bythe facts that H(k=0)=P₀′−K≥0 and H(k=1)=P₀′−P₀<0.

Note that from the definition of k,

$\begin{matrix}{{ɛ_{1}\left( P_{0}^{\prime} \right)} = {k\;{ɛ\left( P_{0} \right)}}} \\{= {\frac{k}{V}\left( {{{f\left( P_{0} \right)} \cdot P_{a}} + {{f^{2}\left( P_{0} \right)} \cdot P_{b}} - K + P_{0}} \right)}} \\{= {\frac{1}{V}\left( {{{{kf}\left( P_{0} \right)} \cdot P_{a}} + {k^{2}{{f^{2}\left( P_{0} \right)} \cdot P_{b}}} - K + P_{0}^{\prime}} \right)}} \\{= {\frac{1}{V}\left( {{{f_{1}\left( P_{0}^{\prime} \right)} \cdot P_{a}} + {{f_{1}^{2}\left( P_{0}^{\prime} \right)} \cdot P_{b}} - K + P_{0}^{\prime}} \right)}}\end{matrix}$

Therefore, (ƒ₁(P₀′),ε₁(P₀′)) is on the curve

$ɛ = {\frac{1}{V}{\left( {{f \cdot P_{a}} + {f^{2} \cdot P_{b}} - K + P_{0}^{\prime}} \right).}}$As a result,

${Q\left( {{f\left( P_{0} \right)},{ɛ\left( P_{0} \right)},P_{0},K} \right)} = {{\frac{1}{k^{2}}{Q\left( {{f_{1}\left( P_{0}^{\prime} \right)},{ɛ_{1}\left( P_{0}^{\prime} \right)},P_{0}^{\prime},K} \right)}} > {Q\left( {{f_{1}\left( P_{0}^{\prime} \right)},{ɛ_{1}\left( P_{0}^{\prime} \right)},P_{0}^{\prime},K} \right)} \geq {Q\left( {{f\left( P_{0}^{\prime} \right)},{ɛ\left( P_{0}^{\prime} \right)},P_{0}^{\prime},K} \right)}}$The first inequality follows since 0≤k<l; and the last inequalityfollows as (ƒ(P₀′),ε(P₀′)) minimizes the object function Q(ƒ,ε,P₀′,K)over all possible (ƒ,ε) such that

$ɛ = {\frac{1}{V}{\left( {{f \cdot P_{a}} + {f^{2} \cdot P_{b}} - K + P_{0}^{\prime}} \right).}}$Thus Q(P₀,K) decreases when P₀ decreases, which concludes the proof.

APPENDIX B

Lemma 2. If ƒ(P₀) and ε(P₀)=(ƒ(P₀)P_(a)+ƒ²(P₀)P_(b)−K+P₀)/V minimizeQ(ƒ,ε,P₀,K) over all possible (ƒ,ε) such that ε=(ƒP_(a)+ƒ²P_(b)−K+P₀)/V,i.e.,

${Q\left( {P_{0},K} \right)} = {{Q\left( {{f\left( P_{0} \right)},{ɛ\left( P_{0} \right)},P_{0},K} \right)} = {\min\limits_{f}\;{{Q\left( {f,{\left( {{fP}_{a} + {f^{2}P_{b}} - K + P_{0}} \right)\text{/}V},P_{0},K} \right)}.}}}$for P_(a)>0, P_(b)>0, and P₀≤K, thenƒ(P ₀)≥0 and ε(P ₀)=(ƒ(P ₀)P _(a)+ƒ²(P ₀)P _(b) −K+P ₀)/V≤0.

Further, as P₀ increases, the minimum Q(P₀,K) decreases.

Proof: 1) We first show that ε(P₀)=(ƒ(P₀)P_(a)+ƒ²(P₀)P_(b)−K+P₀)/Vcannot be positive. This is from the observation that the equation

$ɛ = {{\frac{1}{V}\left( {{f \cdot P_{a}} + {f^{2} \cdot P_{b}} - K + P_{0}} \right)} = 0}$has two real roots as Δ=P_(a) ²−4P_(b)(P₀−K)>0. These two roots are

${f_{1}^{*} = \frac{{- P_{a}} + \sqrt{\Delta}}{2P_{b}}},$which is positive; and

${f_{2}^{*} = {\frac{{- P_{a}} - \sqrt{\Delta}}{2P_{b}} < \frac{- P_{a}}{2P_{b}}}},$which is negative.

$ɛ = {{\frac{1}{V}\left( {{f \cdot P_{a}} + {f^{2} \cdot P_{b}} - K + P_{0}} \right)} > 0}$if and only if ƒ>ƒ₁* or ƒ<ƒ₂*. It is straightforward to show that forany (ƒ₁,ε₁) such that ƒ₁>ƒ₁*>0 and

${ɛ_{1} = {{\frac{1}{V}\left( {{f_{1} \cdot P_{a}} + {f_{1}^{2} \cdot P_{b}} - K + P_{0}} \right)} > 0}},{{Q\left( {f_{1},ɛ_{1},P_{0},K} \right)} = {{\left( \frac{f_{1}}{\sigma_{f}} \right)^{2} + \left( \frac{ɛ_{1}}{\sigma_{ɛ}} \right)^{2}} > {{Q\left( {f_{1}^{*},0,P_{0},K} \right)}.}}}$Similarly, for any (ƒ₁,ε₁) such that ƒ₁<ƒ₂*<0 and

${ɛ_{1} = {{\frac{1}{V}\left( {{f_{1} \cdot P_{a}} + {f_{1}^{2} \cdot P_{b}} - K + P_{0}} \right)} > 0}},{{Q\left( {f_{1},ɛ_{1},P_{0},K} \right)} = {{\left( \frac{f_{1}}{\sigma_{f}} \right)^{2} + \left( \frac{ɛ_{1}}{\sigma_{ɛ}} \right)^{2}} > {{Q\left( {f_{2}^{*},0,P_{0},K} \right)}.}}}$Thus, ε(P₀) must be less than 0 and ƒ₂*≤ƒ(P₀)≤ƒ₁*.2) Next we show that for any (ƒ₁,ε₁) such that

${{f_{1} < {{- \frac{P_{a}}{2P_{b}}}\mspace{14mu}{and}\mspace{20mu} ɛ_{1}}} = {\frac{1}{V}\left( {{f_{1} \cdot P_{a}} + {f_{1}^{2} \cdot P_{b}} - K + P_{0}} \right)}},$there exists a pair (ƒ₂,ε₂) such that

${f_{2} > {- \frac{P_{a}}{2P_{b}}}}\mspace{11mu},{ɛ_{2} = {\frac{1}{V}\left( {{f_{2} \cdot P_{a}} + {f_{2}^{2} \cdot P_{b}} - K + P_{0}} \right)}}$and Q(ƒ₁,ε₁,P₀,K)>Q(ƒ₂,ε₂,P₀,K). Thus, ƒ(P₀) must be greater than

$- {\frac{P_{a}}{2P_{b}}.}$To show this, let

$f_{2} = {{{- \frac{P_{a}}{P_{b}}} - f_{1}} > {- {\frac{P_{a}}{2P_{b}}.}}}$Further,

${{f_{2}^{2} - f_{1}^{2}} = {{\frac{P_{a}^{2}}{P_{b}^{2}} + {\frac{2P_{a}}{P_{b}}f_{1}}} = {{\frac{2P_{a}}{P_{b}}\left( {f_{1} + \frac{P_{a}}{2P_{b}}} \right)} < 0}}},{while}$$\begin{matrix}{ɛ_{2} = {\frac{1}{V}\left( {{f_{2} \cdot P_{a}} + {f_{2}^{2} \cdot P_{b}} - K + P_{0}} \right)}} \\{= {\frac{P_{b}}{V}\left( {\left( {f_{2} + \frac{P_{a}}{2P_{b}}} \right)^{2} - \frac{P_{a}^{2}}{4P_{b}^{2}} + \frac{P_{0} - K}{P_{b}}} \right)}} \\{= {\frac{P_{b}}{V}\left( {\left( {{- f_{1}} - \frac{P_{a}}{2P_{b}}} \right)^{2} - \frac{P_{a}^{2}}{4P_{b}^{2}} + \frac{P_{0} - K}{P_{b}}} \right)}} \\{= {\frac{1}{V}\left( {{f_{1} \cdot P_{a}} + {f_{1}^{2} \cdot P_{b}} - K + P_{0}} \right)}} \\{= ɛ_{1}}\end{matrix}$Therefore,

${Q\left( {f_{1},ɛ_{1},P_{0},K} \right)} = {{{\left( \frac{f_{1}}{\sigma_{f}} \right)^{2} + \left( \frac{ɛ_{1}}{\sigma_{ɛ}} \right)^{2}} > {\left( \frac{f_{2}}{\sigma_{f}} \right)^{2} + \left( \frac{ɛ_{2}}{\sigma_{ɛ}} \right)^{2}}} = {{Q\left( {f_{2},ɛ_{2},P_{0},K} \right)}.}}$3) Now we have shown that

${- \frac{P_{a}}{2P_{b}}} \leq {f\left( P_{0} \right)} \leq {f_{1}^{*}.}$The remaining part is to show that

${- \frac{P_{a}}{2P_{b}}} \leq {f\left( P_{0} \right)} < 0$is not possible. Since for any (ƒ₁,ε₁) such that

${f_{2}^{*} < {- \frac{P_{a}}{2P_{b}}} \leq f_{1} < 0 < f_{1}^{*}},{ɛ_{1} = {{\frac{1}{V}\left( {{f_{1} \cdot P_{a}} + {f_{1}^{2} \cdot P_{b}} - K + P_{0}} \right)} < 0.}}$Then let ƒ₂=0 and ε₂=(P₀−K)/V, we notice that since

${{- \frac{P_{a}}{2P_{b}}} \leq f_{1} < 0},{then}$$\mspace{14mu}\begin{matrix}{ɛ_{1} = {\frac{1}{V}\left( {{f_{1} \cdot P_{a}} + {f_{1}^{2} \cdot P_{b}} - K + P_{0}} \right)}} \\{= {{\frac{P_{b}}{V}\left( {\left( {f_{1} + \frac{P_{a}}{2P_{b}}} \right)^{2} - \frac{P_{a}^{2}}{4P_{b}^{2}} + \frac{P_{0} - K}{P_{b}}} \right)} <}} \\{{\frac{P_{b}}{V}\left( {\left( {0 + \frac{P_{a}}{2P_{b}}} \right)^{2} - \frac{P_{a}^{2}}{4P_{b}^{2}} + \frac{P_{0} - K}{P_{b}}} \right)},} \\{= {{\frac{1}{V}\left( {P_{0} - K} \right)} = {ɛ_{2} \leq 0}}}\end{matrix}$

The last inequality follows since P₀≤K. Thus,Q(ƒ₁,ε₁,P₀,K)>Q(ƒ₂,ε₂,P₀,K), which implies that ƒ(P₀) must be greaterthan 0.

4) Finally, we will show that as P₀ increases, the minimum Q(P₀,K)decreases. For any P₀<P₀′≤K, define ƒ₁(P₀′)=kƒ(P₀) and ε₁(P₀′)=kε(P₀),where k is the root of equation H(k)=k²ƒ²(P₀)P_(b)+k(K−P₀−ƒ²(P₀)P_(b))+P₀′−K=0 which falls between 0 and 1. Before moving forward,we should show that there exists a root in [0, 1) for this equation,which can be proved by the facts that H(k=0)=P₀′−K≤0 andH(k=1)=P₀′−P₀>0.

Note that from the definition of k,

$\begin{matrix}{{ɛ_{1}\left( P_{0}^{\prime} \right)} = {k\;{ɛ\left( P_{0} \right)}}} \\{= {\frac{k}{V}\left( {{{f\left( P_{0} \right)} \cdot P_{a}} + {{f^{2}\left( P_{0} \right)} \cdot P_{b}} - K + P_{0}} \right)}} \\{= {\frac{1}{V}\left( {{{{kf}\left( P_{0} \right)} \cdot P_{a}} + {k^{2}{{f^{2}\left( P_{0} \right)} \cdot P_{b}}} - K + P_{0}^{\prime}} \right)}} \\{= {\frac{1}{V}\left( {{{f_{1}\left( P_{0}^{\prime} \right)} \cdot P_{a}} + {{f_{1}^{2}\left( P_{0}^{\prime} \right)} \cdot P_{b}} - K + P_{0}^{\prime}} \right)}}\end{matrix}$

Therefore, (ƒ(P₀′),ε₁(P₀′)) is on the curve

$ɛ = {\frac{1}{V}{\left( {{f \cdot P_{a}} + {f^{2} \cdot P_{b}} - K + P_{0}^{\prime}} \right).}}$As a result,

${Q\left( {{f\left( P_{0} \right)},{ɛ\left( P_{0} \right)},P_{0},K} \right)} = {{\frac{1}{k^{2}}{Q\left( {{f_{1}\left( P_{0}^{\prime} \right)},{ɛ_{1}\left( P_{0}^{\prime} \right)},P_{0}^{\prime},K} \right)}} > {Q\left( {{f_{1}\left( P_{0}^{\prime} \right)},{ɛ_{1}\left( P_{0}^{\prime} \right)},P_{0}^{\prime},K} \right)} \geq {Q\left( {{f\left( P_{0}^{\prime} \right)},{ɛ\left( P_{0}^{\prime} \right)},P_{0}^{\prime},K} \right)}}$

The first inequality follows since 0≤k<1; and the last inequalityfollows as (ƒ(P₀′),ε(P₀′)) minimizes the object function Q(ƒ,ε,P₀′,K)over all possible (ƒ,ε) such that

$ɛ = {\frac{1}{V}{\left( {{f \cdot P_{a}} + {f^{2} \cdot P_{b}} - K + P_{0}^{\prime}} \right).}}$Thus Q(P₀,K) decreases when P₀ increases, which concludes the proof.

What is claimed is:
 1. A method comprising: computing, using a computerhardware system, an analytical function of a process condition parameterthat approximates an aerial or resist image value across a processwindow for each of a plurality of evaluation points in a target patternfor a lithographic process; determining a target value of the imagevalue for each evaluation point based on the analytical function and anominal condition comprising a nominal value of the process conditionparameter, wherein determining the target value for each evaluationpoint is performed such that the process window about the nominal valueof the process condition parameter is increased or maximized; andperforming, by the computer hardware system, optical proximitycorrection on the target pattern, using the target value as anoptimizing target for each evaluation point, for physicallymanufacturing a physical patterning device based on the target patternor for physically performing the lithographic process on a substratebased on the target pattern.
 2. The method according to claim 1, whereinthe process window comprises a range of a specific process parameterwithin which a resist critical dimension, and therefore one or moreimage values, is contained in a pre-defined range.
 3. The methodaccording to claim 1, wherein the determining the target value of theimage value comprises using a numerical method.
 4. The method accordingto claim 1, wherein an analytical target image value is given for bestfocus.
 5. The method according to claim 1, wherein the computing theanalytical function comprises computing a polynomial function.
 6. Themethod according to claim 1, further comprising re-determining thetarget value for each evaluation point at each of a plurality of opticalproximity correction iterations.
 7. A method comprising: forming ananalytical function of a process condition parameter of a lithographicprocess to approximate a value of an aerial or resist imagecharacteristic; approximating values of the image characteristic acrossa plurality of values of the process condition parameter for each of aplurality of evaluation points in a target pattern using the analyticalfunction; determining, using a computer hardware system, a target valueof the image characteristic for each evaluation point based on theapproximated values of the image characteristic corresponding to amaximum range of values of the process condition parameter; andperforming, by the computer hardware system, optical proximitycorrection on the target pattern, using the target value as anoptimizing target for each evaluation point, for physicallymanufacturing a physical patterning device based on the target patternor for physically performing the lithographic process on a substratebased on the target pattern.
 8. The method according to claim 7, whereinthe maximum range of values of the process condition parametercorresponds to a process window comprising a range of a specific processparameter within which a critical dimension, and therefore imagecharacteristic value, is contained in a pre-defined range.
 9. The methodaccording to claim 7, wherein an analytical target image characteristicvalue is given for best focus.
 10. The method according to claim 7,wherein forming the analytical function comprises forming a polynomialfunction.
 11. The method according to claim 7, further comprisingre-determining the target value for each evaluation point at each of aplurality of optical proximity correction iterations.
 12. Anon-transitory computer-readable storage medium havingcomputer-executable instructions stored therein to cause a computer toat least: compute an analytical function of a process conditionparameter that approximates an aerial or resist image value across aprocess window for each of a plurality of evaluation points in a targetpattern; determine a target value of the image value for each evaluationpoint based on the analytical function and a nominal conditioncomprising a nominal value of the process condition parameter, whereindetermining the target value for each evaluation point is performed suchthat the process window about the nominal value of the process conditionparameter is increased or maximized; and perform optical proximitycorrection on the target pattern, using the target value as anoptimizing target for each evaluation point, for physicallymanufacturing a physical patterning device based on the target patternor for physically performing the lithographic process on a substratebased on the target pattern.
 13. The storage medium according to claim12, wherein the process window comprises a range of a specific processparameter within which a critical dimension, and therefore image value,is contained in a pre-defined range.
 14. The storage medium according toclaim 12, wherein the computer-executable instructions configured todetermine the target value of the image value are configured to use anumerical method.
 15. The storage medium according to claim 12, whereinan analytical target image value is given for best focus.
 16. Thestorage medium according to claim 12, wherein the computer-executableinstructions configured to compute the analytical function areconfigured to compute a polynomial function.
 17. The storage mediumaccording to claim 12, wherein the computer-executable instructions arefurther configured to cause a computer to re-compute the target valuefor each evaluation point at each of a plurality of optical proximitycorrection iterations.
 18. A non-transitory computer-readable storagemedium having computer-executable instructions stored therein to cause acomputer to at least: form an analytical function of a process conditionparameter of a lithographic process to approximate a value of an aerialor resist image characteristic; approximate values of the imagecharacteristic across a plurality of values of the process conditionparameter for each of a plurality of evaluation points in a targetpattern using the analytical function; determine a target value of theimage characteristic for each evaluation point based on the approximatedvalues of the image characteristic corresponding to a maximum range ofvalues of the process condition parameter; and perform optical proximitycorrection on the target pattern, using the target value as anoptimizing target for each evaluation point, for physicallymanufacturing a physical patterning device based on the target patternor for physically performing the lithographic process on a substratebased on the target pattern.
 19. The storage medium according to claim18, wherein the maximum range of values of the process conditionparameter correspond to a process window comprising a range of aspecific process parameter within which a critical dimension, andtherefore image characteristic value, is contained in a pre-definedrange.
 20. The storage medium according to claim 18, wherein ananalytical target image characteristic value is given for best exposuredose.